ABSTRACT


This report describes the first six months of a program of applied research and
development whose purpose is to explore the practical implications and potential uses of
computer  technology  in  comprehensive  military  planning.  The  program,  called  Computer- 
Assisted  Planning,  is  building  on  earlier  work  in  Computer-Aided  Command  that  focused 
on  developing  a  prototype  military  computer  utility  (the  ADEPT-50  system)  and  on  using 
that  facility  as  a  tool  for  exploring  the  planning  needs  of  military  commanders. 

The immediate goal of the Computer-Assisted Planning program is to investigate the
impact of improved communications on the procurement and use of computers by Department
of  Defense  planners.  "Communications,"  in  this  context,  includes  both  communication 
between the planner and his computer resources, and communication among computer
facilities, especially those that are elements of a computer network. The program has
two major objectives. The first is to provide military planners with models and
procedures that will assist them both in strategic and tactical planning and in planning
the  acquisition  and  use  of  computation  and  communications  resources.  The  second  is  to 
develop  technologies  and  procedures  that  will  enable  planners  to  interface  with  their 
computers directly, through ordinary language and notational systems.

Progress  is  described  in  three  major  areas  of  study:  (1)  Computation  and  Communication 
Tradeoffs  Study  (CACTOS);  (2)  Natural  Computer  Input/Output;  and  (3)  Systems  Research 
and Interactive Systems. Plans for work during the remaining six months of the
contract period are presented.
1.  OVERVIEW


This  report  describes  the  first  six  months  (from  16  September  1970  to  15  March 
1971) of a program of applied research and development whose purpose is to
explore  the  practical  implications  and  potential  uses  of  computer  technology 
in  comprehensive  military  planning.  The  program,  called  Computer-Assisted 
Planning,  is  building  on  earlier  work  in  Computer-Aided  Command  that  was 
focused  on  developing  a  prototype  military  computer  utility  (the  ADEPT-50 
system)  and  on  using  that  facility  as  a  tool  for  exploring  the  planning  needs 
of  military  commanders. 

1.1  GOALS  AND  OBJECTIVES 

Research covered by this contract is directed toward the long-range goal of
substantially improving understanding of the strategic planning process,
with the eventual objective of embodying that understanding in an experimental,
prototype computer-based planning system. The system envisioned
would integrate old and new technologies, permitting teams of Department of
Defense  (DoD)  planners  to  interact  with  each  other,  with  computer-based  data 
bases,  and  with  analysis  tools  via  natural  communications  in  order  to  achieve 
planning  objectives  more  rapidly  or  with  higher  quality. 

The  immediate  goal  of  the  Computer-Assisted  Planning  (CAP)  program  is  to 
investigate  the  impact  of  improved  communications  on  the  use  of  computers  by 
DoD planners. "Communications," in this context, includes both communication
between  the  planner  and  his  computer  resources,  and  communication  among 
computer  facilities,  especially  those  that  are  elements  of  a  computer  network. 
The  CAP  program  has  three  major  objectives:  (1)  to  provide  military  planners 
with  models  and  procedures  that  will  assist  them  both  in  strategic  and 
tactical  planning  and  in  planning  the  acquisition  and  use  of  computation  and 
communications  resources;  (2)  to  develop  technologies  and  procedures  that 
will  enable  planners  to  interface  with  their  computers  directly,  through 
ordinary language and notational systems; and (3) to develop the computer
systems  technologies  needed  to  construct  future  planning  systems. 

1.2  PROGRAM  HIGHLIGHTS 

The  work  reported  herein  covers  three  areas:  Computation  and  Communication 
Tradeoff Studies (CACTOS); Natural Computer Input and Output; and Systems
Research. 

CACTOS 


CACTOS  is  examining  the  1975-80  DoD  requirements  for  computation  and 
communication  resources,  with  particular  attention  to  the  tradeoffs  between 
concentrated and distributed computation power. Work divides into development
of a computer network analysis model, and the definition and construction of a


15 April 1971


2 


System  Development  Corporation 
TM-3628/008/00 


real-world planning data base on which to exercise the model. Progress was
made  in  both  areas  with  the  selection  of  the  Marine  Corps  Personnel  Network 
system for study. This non-trivial, non-classified system provides the
practical  context  for  model  and  data  base  construction,  which  is  in  progress. 
Earlier,  a  prototype  network  analysis  model  was  built  and  tested  against  the 
ARPA Network both as it now exists and as it might be reconfigured. Interesting
results are reported that suggest modifications to the ARPA Network
topology. 

Natural  Computer  Input  and  Output 

This  research  is  directed  toward  integrating  the  interface  between  man  and 
machine  to  make  their  transactions  as  "human"  as  possible — through  English, 
both  by  keyboard  and  by  voice,  and  through  two-dimensional  mathematics  and 
drawings. 

CONVERSE  version  V-0,  a  significant  milestone  for  an  English-language  Data 
Management  System  (EDMS),  was  demonstrated  during  this  period.  Version  0 
is the first of a new series of such EDMSs. It translates English questions
of moderate complexity into a data management language, using only
surface-structure information and data from a small concept network.

Successor versions V-1 and V-2, currently under development, employ new
surface-structure  and  deep-structure  parsers,  a  richer  data  base,  a  capability 
for  recognizing  and  processing  declarative  and  imperative  sentences  (in 
addition to interrogative sentences), and user-feedback and language-extension
facilities. 

Years of work in hand-printed character recognition and two-dimensional
graphic  input/output  came  together  during  this  period  with  the  impressive 
demonstration  of  The  Adding  Machine  (TAM).  Operationally,  TAM  can  be 
viewed  as  a  "reactive  blackboard"  on  which  the  user  prints  arithmetic 
expressions  that  the  computer  reacts  to  by  printing  back  correct  evaluations. 

The  level  of  accomplishment  includes  a  powerful  set  of  operators;  fixed, 
floating-point,  and  array  variables;  looping;  and  both  built-in  and  user- 
defined  functions.  Extensions  to  TAM  to  enable  symbolic  processing  operators 
are  being  designed,  with  the  goal  of  achieving  a  fully  interactive  computational 
language. 

This  past  half  year  of  our  Voice  I/O  research  was  devoted  to  laboratory 
development  and  planning  strategies  for  solving  the  continuous  speech 
problem.  We  have  selected  an  incremental  approach  leading  to  a  Voice- 
CONVERSE  system.  Starting  from  a  Vicens-Reddy  (V-R)  base,  we  are  aiming 
toward  an  intermediate  accomplishment — a  Voice  Data  Management  System  (VDMS) 
of linguistic complexity matching such current data management systems as
TDMS  or  DS/2.  Highlights  of  our  program  to  date  include  the  completion  of 
our voice laboratory built around the Raytheon 704 computer for signal
processing  of  acoustic  information.  The  reimplementation  of  the  V-R  system 
is  nearly  complete,  and  significant  analyses  of  the  V-R  algorithms  and 
heuristics,  not  heretofore  clearly  understood,  have  been  published. 

Extensions  and  improvements  to  V-R  are  continuing  in  a  system  called  TWIPER — 
our  base  for  future  VDMS  progress. 

Systems  Research 

Our  efforts  in  systems  research  are  toward  the  practical  realization  of  the 
computer-system  technologies  that  a  comprehensive  CAP  system  will  require. 
During  this  reporting  period,  our  time-sharing  system,  ADEPT,  was  successfully 
moved  from  the  IBM  360/50  computer  to  the  IBM  360/67.  The  move  aided  our 
research  significantly  by  providing  a  compatible,  reliable  executive, 
expanded  to  support  faster  terminals  (up  to  300-baud)  and  a  LISP  1.5  system 
that  has  been  expanded  to  85  pages  (approximately  348,000  bytes)  of  resident 
core  memory. 

A  significant  ARPA  Network  milestone  was  reached  as  this  document  went  to 
press: a test, SDC to RAND, of interprocess communication. Complete network
operation (TELNET) will be available this summer as we complete our time-sharing
Network program (HOSTCSS) and retrofit existing routines to comply
with  recent  HOST-to-HOST  protocol  changes.  The  distributed  data  base  study 
has  concluded  that  network  data  management  can  best  be  achieved  through  the 
integration  of  local  Node  Data  Management  Systems  (NDMS)  with  a  common 
network  data  management  language  and  appropriate  local  interfaces.  English 
is  proposed  as  that  common  network  language,  and  a  single,  central-node, 
CONVERSE-like translator system is being advanced to translate English
queries  into  each  NDMS  form. 

Theoretical  work  on  the  use  of  graphs  in  global  program  optimization  was 
completed  and  documented  this  period  as  Part  I  of  a  two-part  text  entitled 
"A  Mathematical  Theory  of  Global  Program  Analysis."  Part  II,  dealing  with 
practical applications, is in preparation; the complete text will be
finished  during  this  contract  year.  A  practical  demonstration  of  this 
research  is  embodied  in  the  construction  of  a  benchmark  FORTRAN  IV  compiler. 
Pass I of the three-pass compiler has been designed and coded and is in checkout;
passes II and III are in design.

Gaku, the model of man-machine cooperative problem solving, is coming to
grips with the organizational problems that must be solved to build a complex planning
system.  Gaku  is  evolving  incrementally  as  a  collection  of  rules  written  in 
the  User-Adaptive  Language  (UAL)  specifically  designed  for  stating  such 
complex  problems.  During  the  reporting  period,  major  advances  were  made  in 
the  construction  (in  SDC  LISP)  of  a  basic  UAL.  A  modestly  sophisticated 
demonstration  of  man-machine  interaction  is  now  possible.  To  achieve  the 
level  of  progress  reported  herein  for  both  UAL  and  CONVERSE,  a  number  of 
improvements were required to SDC LISP. Most significant was the dramatic
increase (a quadrupling) in LISP user data space made possible by the 85-page LISP
under  ADEPT  on  the  IBM  360/67. 

1.3  ORGANIZATION  OF  REPORT 

The  body  of  this  report  describes  in  detail  the  projects  devoted  to  the  three 
main  areas  of  investigation:  Computation  and  Communication  Tradeoff  Studies 
(section  2),  Natural  Computer  Input  and  Output  (section  3),  and  Systems 
Research  (sections  4  and  5).  Each  section  includes  a  detailed  description 
of the projects pursued, the professional staff, and the technical publications
produced during the contract period. The goals, problems, successes,
progress,  and  status  for  the  past  six  months  are  described  for  each  project. 
2.  COMPUTATION AND COMMUNICATION TRADEOFF STUDY (CACTOS)

The  goal  of  the  Computation  and  Communication  Tradeoff  Study  is  to  determine 
specific  DoD  requirements  for  computation  and  communication  networks  on  a 
regional,  functional,  and  categorical  basis  for  the  1975-80  time  period. 

With  the  continuing  advance  of  technology  in  computer  hardware,  software, 
and  communications,  it  is  important  that  an  overall  analysis  be  done  of  DoD 
requirements  in  these  areas.  Implicit  in  such  an  analysis  are  the  tradeoffs 
between  various  characteristics  of  computer  and  communication  networks,  which 
must  be  examined  several  years  prior  to  the  procurement  of  equipment  and 
facilities,  and  which  must  be  related  not  only  to  cost  and  time,  but  to  each 
other.  Such  a  tradeoff  study  must  be  at  least  in  part  quantitative  and  be 
capable  of  wide  usage  so  as  to  relate  to  specific  network  configurations 
within  DoD. 

To  achieve  this  goal,  the  following  objectives  have  been  identified: 

1.  Determine  amounts  of  data  processed,  stored,  and  retrieved, 
including response time and tradeoffs between security, vulnerability,
throughput, reliability, cost, and time.

2.  Construct  analytic  models  for  evaluating  and  modifying  selected 
networks.

3.  Perform  tradeoff  studies  based,  in  part,  on  the  results  of  the 
model  analysis. 

4.  Validate the analysis using an existing military network.

5.  Describe  future  technology  in  areas  relating  to  computers  and 
communications.

During the past six months, effort was spent in (1) delineating the descriptors
of a network, (2) establishing the ingredients of the data base, and
(3)  selecting  a  validation  DoD  network.  Effort  was  also  directed  at 
obtaining  state-of-the-art  forecasts  of  technology  for  central  processing 
units,  memories,  operating  systems,  control  and  display  devices  (terminals), 
communication  channels  and  services,  modems,  switches,  concentrators,  and 
software. 

An analysis was conducted of what key computation and communication characteristics
of networks would apply generally, be available from analyzing
data,  and  permit  the  comparison  of  networks.  As  this  was  done,  it  became 
clear  that  two  general  efforts  were  needed.  The  first  is  that  of  developing 
and  adapting  a  network  model,  and  the  second  is  that  of  selecting  an 
appropriate  data  base  for  use  in  the  initial  analysis.  These  efforts  are 
described  separately  in  the  following  pages.  It  should  be  noted  here  that 
the search for an appropriate data base resulted in the selection of the
Marine  Corps  Personnel  Network,  a  non-trivial,  unclassified,  and  well- 
documented  system.  The  Marine  Corps  is  providing  us  with  considerable  data 
and  cooperation. 

2.1  MODEL  DEVELOPMENT  PROJECT 

2.1.1  Progress 

The  model  development  had  as  its  initial  goal  the  construction  of  a  prototype 
network  analysis  model.  This  has  been  achieved  and  is  described  below. 

The model operates on line under TS/DMS* and ADEPT at SDC. It is programmed
in FORTRAN IV and operates within the framework of the SDC program, DESIGNET.**
The  model  inputs  consist  of  network  configurations  (nodes  and  links),  node 
and  link  characteristics  (processing  capabilities  and  costs),  and  the  sizes, 
arrival  rates,  sources,  and  destinations  of  messages  and  jobs.  The  network 
configuration  is  entered  either  on  line,  by  individual  links  or  node 
connectivities,  or  by  selection  from  a  data  base.  A  job-arrival-rate  matrix 
is  specified,  and  message  sizes,  job  sizes,  and  link  lengths  may  be  given 
either  specific  matrices  or  average  values  for  the  net.  Job  processing 
rates  are  given  for  each  node,  and  channel  capacities  may  be  specified 
either  for  each  channel  or  for  the  net  as  a  whole.  (If  a  total  channel 
capacity  is  specified,  the  model  will  compute  an  optimal  allocation  for  each 
channel  to  match  the  message  traffic.)  Each  of  these  inputs  may  be  specified 
on  line  in  a  conversational  mode  or  selected  from  a  data  base.  The  inputs 
to the model are saved and can be modified on line by the user, who can
add  or  delete  nodes  and  links,  change  traffic  characteristics,  and  run  the 
model  in  an  iterative  fashion  to  find  optimal  conditions. 
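
The report does not state which rule the model uses to divide a total channel capacity among the links. One classical candidate is the square-root assignment, which minimizes mean message delay when each link is treated as an independent M/M/1 queue. The following is a minimal sketch under that assumption; the function name, argument names, and units are illustrative and are not taken from the SDC program.

```python
import math


def square_root_allocation(flows, mean_msg_bits, total_capacity):
    """Split total_capacity (bits/sec) across links by the square-root rule.

    flows          -- messages/sec carried by each link (assumed known)
    mean_msg_bits  -- mean message length in bits
    total_capacity -- sum of capacities to be distributed, in bits/sec
    """
    # Capacity each link needs merely to carry its offered traffic.
    base = [f * mean_msg_bits for f in flows]
    excess = total_capacity - sum(base)
    if excess <= 0:
        raise ValueError("total capacity does not cover the offered traffic")

    roots = [math.sqrt(f) for f in flows]
    root_sum = sum(roots)
    if root_sum == 0:
        # No traffic anywhere: share the capacity equally.
        return [total_capacity / len(flows)] * len(flows)

    # The excess is shared in proportion to the square root of each flow.
    return [b + excess * r / root_sum for b, r in zip(base, roots)]
```

Lightly loaded links receive proportionally more spare capacity than a simple proportional split would give them, which is what keeps their delays from dominating the network average.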

The  outputs  from  the  model  consist  of  a  network  analysis,  a  performance 
analysis,  and,  if  desired,  a  traffic  analysis  for  either  nodes  or  links  or 
both.  The  model  will  also  produce  a  graph  of  any  two  performance  variables 
(e.g.,  response  time  and  cost)  plotted  against  one  another.  The  network 
analysis  consists  of  a  statement  of  the  numbers  of  nodes  and  links,  the 
link-to-node ratio (assuming full-duplex lines), the variance (an indication
of the "clumpiness" of the network), the radius (the "shortest longest"
path between any two nodes), the diameter (the shortest path between the two
most distant nodes), the number of articulation points of order one (a point
that, if deleted, would break the net into two subnets), and the number of

*TS/DMS  is  a  commercial  time-sharing  system  marketed  by  SDC. 

**DESIGNET is a proprietary SDC software package for analyzing the optimum
distribution  of  resources  in  an  interconnected  commodities  network. 
circuits of length three. The performance analysis consists of the average
path  length,  the  mean  communication  response  time,  the  mean  computation 
response  time,  the  total  response  time,  and  the  total  cost.  The  traffic 
analysis  gives  traffic,  capacity,  delay  time,  and  cost  for  each  link,  and 
traffic,  delay,  and  cost  for  each  node. 
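
The topological measures named in the network analysis can be illustrated with a short sketch. It assumes a connected network of full-duplex links and measures path length in links traversed. The report does not define "variance" precisely; the sketch takes it to be the variance of the number of links per node, which is an assumption. All function and key names are illustrative.

```python
from collections import deque
from itertools import combinations


def hop_counts(adj, src, skip=None):
    """Breadth-first hop counts from src, optionally ignoring one node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != skip and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def analyze(links):
    """Return the topological measures for a connected, full-duplex network."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    nodes = sorted(adj)
    n_links = len({frozenset(link) for link in links})

    # Eccentricity: the longest shortest path starting at each node.
    ecc = {n: max(hop_counts(adj, n).values()) for n in nodes}

    # A node is an articulation point if deleting it disconnects the rest.
    articulation = []
    for n in nodes:
        others = [m for m in nodes if m != n]
        if others and len(hop_counts(adj, others[0], skip=n)) < len(others):
            articulation.append(n)

    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    degrees = [len(adj[n]) for n in nodes]
    mean_deg = sum(degrees) / len(degrees)
    return {
        "nodes": len(nodes),
        "links": n_links,
        "link_to_node_ratio": n_links / len(nodes),
        "degree_variance": sum((d - mean_deg) ** 2 for d in degrees) / len(degrees),
        "radius": min(ecc.values()),
        "diameter": max(ecc.values()),
        "articulation_points": articulation,
        "circuits_of_length_three": triangles,
    }
```

The brute-force searches are adequate for a network of the size discussed here (15 nodes); a larger network would call for linear-time articulation-point and triangle-counting methods.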

Network  characteristics  can  be  treated  as  variables  and  manipulated  to 
determine  the  network  performance  and  cost  effects  of  changing  conditions 
(e.g.,  anticipated  growth  or  shifts  in  processing  requirements),  or  to 
evaluate  changes  to  existing  or  proposed  networks.  Some  of  the  variables 
that may be manipulated for given network configurations and traffic densities
are: 

1.  Load  Conditions.  The  model  probes  areas  of  response  sensitivity 
for various combinations of job and message arrival rates, job
and  message  sizes,  and  traffic  patterns. 

2.  Computation and Communication Capacities. The effects of manipulating
node and link processing capacities can be evaluated.
Relative bottlenecks and alternative routings can be created by
adjusting capacities and by adding and deleting links and nodes.

3.  Network Vulnerability. The network's vulnerability to accidental
or deliberate destruction of nodes and links, and the corresponding
degradation in performance, can be evaluated. Finding network
weaknesses presents a clear challenge to network designers, as
does minimizing degradation of performance under a hostile
environment.

4.  Network Topological Characteristics. The effects of changing such
topological characteristics as the radius, diameter, link-to-node
ratio, and variance can be evaluated. Determining direct relationships
between topological characteristics and performance would
lead to principles for creating efficient networks.

5.  Distributed  Intelligence.  The  effects  of  the  relative  centralization 
versus  the  relative  dispersion  of  logic  (computing  capacity)  and 
information stores can be evaluated. The effect of advancing
technology (for example, drastic reductions in logic, storage, and
transmission costs) on the relative cost-effectiveness of different
distributions of intelligence is a major tradeoff being considered
in CACTOS.
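
As one illustration of the vulnerability variable in the list above, the following sketch deletes a set of nodes and reports what fraction of the surviving node pairs can still communicate. This particular degradation measure is an assumption chosen for simplicity; the report does not say how the model scores vulnerability, and the names used are hypothetical.

```python
from collections import deque


def reachable(adj, src, removed):
    """Set of surviving nodes reachable from src when `removed` are deleted."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def surviving_pair_fraction(links, removed):
    """Fraction of surviving node pairs still connected after node loss."""
    removed = set(removed)
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    alive = [n for n in adj if n not in removed]
    total_pairs = len(alive) * (len(alive) - 1) // 2
    if total_pairs == 0:
        return 1.0

    # Count pairs inside each connected component of the damaged network.
    connected_pairs = 0
    unvisited = set(alive)
    while unvisited:
        component = reachable(adj, next(iter(unvisited)), removed)
        connected_pairs += len(component) * (len(component) - 1) // 2
        unvisited -= component
    return connected_pairs / total_pairs
```

Running the function once per node (or per small set of nodes) ranks the points whose loss would do the most damage, which is the kind of weakness the text asks designers to find.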
There are certain assumptions and capability limitations of the present model.
They  are: 

Assumptions 

1.  The  node  delay  of  .001  seconds  is  constant  for  each  message — based 
on  Dr.  Kleinrock’s  (UCLA)  observations  of  the  ARPA  Network. 

2.  Node-traffic  capacities  are  infinite. 

3.  Message  buffers  are  infinite — based  on  Dr.  Frank’s  (NAC)  contention 
that this holds for lines with 80% or less utilization.

4.  Poisson  statistics  apply  to  interarrival  times  and  message  lengths. 

5.  Arrival  statistics  are  independent  of  message  lengths. 

6.  There  are  no  acknowledgement  messages  (to  be  relaxed). 

7.  There is no division of messages into packets (to be relaxed).

Limitations


1.  All  switching  is  store  and  forward  (to  be  generalized). 

2.  Nodes  are  connected  by  full-duplex  lines  (to  be  generalized  to 
allow for simplex lines).

3.  Channel capacities are assigned on the basis of a fixed sum of
capacities rather than by cost (to change and allow for cost).

4.  Fixed-minimum-link routing is used (this has been shown to be
nearly optimal by NAC).

5.  Computer throughput is designated by a single number, the number of
megabits  (Mb)  modified/sec. 

6.  All  transactions  are  of  equal  priority  (to  be  modified). 
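
Taken together, the assumptions listed above (random arrivals, randomly distributed message lengths, unlimited buffers, and a constant node delay) correspond to the standard treatment of each link as an independent M/M/1 queue. The sketch below shows how a mean communication response time could be computed on that basis. It is an illustration of the textbook formula, not a transcription of the SDC model, and the function names are hypothetical. The traffic-weighted averaging and the addition of the 0.001-second node delay per link traversed are assumptions.

```python
def link_delay(arrival_rate, mean_msg_bits, capacity_bps):
    """Mean time in seconds a message spends queued and in transmission.

    arrival_rate  -- messages/sec offered to the link
    mean_msg_bits -- mean message length in bits
    capacity_bps  -- link capacity in bits/sec
    """
    service_rate = capacity_bps / mean_msg_bits  # messages/sec the link can send
    if arrival_rate >= service_rate:
        raise ValueError("link is saturated")
    return 1.0 / (service_rate - arrival_rate)


def mean_response_time(link_rates, capacities, mean_msg_bits,
                       total_rate, node_delay=0.001):
    """Traffic-weighted mean delay over all links.

    total_rate is the rate (messages/sec) at which messages enter the
    network; node_delay is the constant per-node delay from assumption 1.
    """
    return sum(
        (lam / total_rate) * (link_delay(lam, mean_msg_bits, cap) + node_delay)
        for lam, cap in zip(link_rates, capacities)
    )
```

With 70,000-bit messages on a 50,000 bit/sec link, a single link can send about 0.71 messages per second, so an offered load of 0.5 messages per second already yields a delay of several seconds per link; the formula shows why response time is so sensitive to link loading.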

The ARPA Network as shown in Figure 2-1 was explored with the prototype model.
The specific objectives being explored were a decrease in


The  comments  in  parentheses  refer  to  planned  near-term  changes  to  the  model. 


Figure  2-1.  ARPA  Network  (with  modifications) 

Figure 2-2.  Data Input for ARPA Network: (a) Job Arrival Matrix (Number of Jobs per Day
Sent from Node i to be Processed at Node j); (b) Computer Processing Power (Throughput)
at Each of 15 Nodes (in modified Mb/sec.)
bottlenecks, and a significant reduction in response time, by decreasing the
average message path length traveled. Input data consisted of the job-arrival
matrix in Figure 2-2(a), based on "congenial" node traffic (the (i,j)th entry
is the number of jobs sent from node i to node j); message sizes of
70 kb (set large to make up for the low job-arrival rates and the lack of
packets and acknowledgements); computing center throughputs shown in
Figure 2-2(b); mean job sizes to be processed at node j of 15 x P_j, where P_j
is the throughput (modified megabits/sec.) of the computers at node j; and real
mileage distances between nodes. All links were preset to 50 kb.
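
One consequence of setting the mean job size at node j to 15 times that node's throughput is that the mean processing time per job is 15 seconds at every node, regardless of its power. The sketch below makes that arithmetic explicit and, as an assumption not stated in the report, treats each computing center as a single M/M/1 server to estimate a computation response time. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def mean_service_time(throughput, size_factor=15.0):
    """Seconds to process a mean-size job at a node.

    throughput  -- node throughput in modified megabits/sec
    size_factor -- multiplier giving mean job size (15 in the experiment)
    """
    mean_job_size = size_factor * throughput  # modified megabits
    return mean_job_size / throughput


def node_response_time(jobs_per_day, throughput, size_factor=15.0):
    """Assumed M/M/1 response time (waiting plus processing) in seconds."""
    service_rate = 1.0 / mean_service_time(throughput, size_factor)
    arrival_rate = jobs_per_day / SECONDS_PER_DAY
    if arrival_rate >= service_rate:
        raise ValueError("node is saturated")
    return 1.0 / (service_rate - arrival_rate)
```

At arrival rates of tens of jobs per day the nodes are almost idle under this model, so the computation response time stays close to the 15-second service time.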

The  results  indicated  that  by  adding  six  links  (dashed  lines)  and  deleting 
two  existing  links  (dotted  lines),  the  modified  15-node,  23-link  network 
successfully  accomplished  the  stated  objectives.  The  mean  communication 
response  time  was  reduced  from  57.4  seconds  to  39.9  seconds.  The  average 
path  length  was  reduced  from  2.8  to  2.2  links  traversed,  and  the  diameter 
was  reduced  from  6  to  4.  Bottlenecks  were  diminished  from  a  worst  case  of 
83  jobs/day  (including  9  links  of  50  or  more)  to  a  worst  case  of  40  jobs/day. 
The details of this experiment are currently being documented in a forthcoming
SDC technical memorandum.

The  model  results  have  been  discussed  with  Dr.  Kleinrock  of  UCLA  and 
Dr.  Frank  of  NAC.  Both  felt  that  the  model  would  be  improved  by  breaking 
messages  into  packets  and  sending  acknowledgements,  and  this  will  be  done 
in  the  next  period. 

2.1.2  Plans 

Improvements  to  the  model  during  the  next  six  months  will  include,  in 
addition  to  the  changes  mentioned  in  parentheses  above,  the  following 
capabilities:

1.  Storage  Capacity.  For  given  traffic  densities  and  processing 
capacities,  the  effects  of  limited  storage  capacity  on  network 
performance  in  general,  and  on  alternate  routing  and  load  balancing 
in  particular,  will  be  evaluated.  The  impact  on  network 
efficiencies  and  costs  of  very  large,  very  cheap,  very  fast 
memories  (say,  lO1^  bits  of  laser  or  bubble  memory,  such  as  are 
promised  by  advanced  technology)  will  also  be  a  tradeoff  of 
interest  to  CACTOS. 

2.  Error  Rates.  The  effects  of  modifying  error  rates  in  computing  and 
communications will be evaluated for various configurations and
traffic  densities. 

3.  Reliabilities. The effects of modifying reliabilities of network
components will be evaluated for various configurations and traffic
densities.
The tradeoff analysis will be based largely on the model results. Of particular
interest during the next six months will be studies of centralized
versus distributed networks, supernodes, and message versus circuit switching.
It will also be possible to get graphic analysis of performance characteristics
such  as  reliability,  vulnerability,  throughput,  and  security  in  a  time  and 
cost  framework. 

The  Marine  Corps  Personnel  Network  system  is  being  modeled  with  its  existing 
configuration and its connections to AUTODIN. Snapshots of the system,
reflecting  the  yearly  growth  in  processing  requirements  over  the  years, 
will  be  taken  to  detect  evolutionary  changes  in  performance.  Alternate 
configurations,  including  dedicated  communication  trunks,  will  be  considered 
to  determine  their  impact  on  network  performance  and  costs.  Distributed 
intelligence  strategies  for  base-level  computers  may  be  considered. 

The  next  step  will  be  to  construct  an  interface  between  the  Marine  Corps 
data base and the model. This will allow the user to access files for
information  on  nodes.  For  convenience,  this  access  will  be  called  an 
executive;  whether  it  will  be  on  line  or  batch  remains  to  be  determined. 

The model's greatest utility is in an interactive mode, and on-line access
would  be  desirable,  but  may  require  more  development  effort  than  can  presently 
be  given  to  it.  The  situation  is  being  evaluated  now. 

Once  the  candidate  networks  are  evaluated  and  the  tradeoff  experimentation 
is  begun,  the  next  step  in  model  development  will  be  to  build  a  more 
general  analysis  model  that  can  be  used  to  analyze  large,  complex  networks, 
such  as  command  and  control  networks.  The  general  model  will  have  an 
executive  that  may  access  not  only  the  data  base,  but  modules  containing 
(for  example)  simulation  and  statistical  routines.  These  additional  modules 
will  not  be  identified  until  the  prototype  model  is  complete  and  some 
experience  has  been  gained  from  the  initial  network  analyses. 

2.2  DATA  BASE  DEVELOPMENT  PROJECT 

2.2.1  Progress 

The  initial  tasks  in  data  base  development  have  been  to  isolate  the  ingredients 
that  would  identify  the  central  parameters,  or  characteristics,  of  a  network 
and  to  determine  what  data  could  be  made  available  from  candidate  networks. 

The  results  of  these  tasks  are  summarized  below. 

A  portion  of  the  data  base  is  devoted  to  the  network  configuration,  and  is 
limited  to  the  basic  characteristics  of  nodes,  links,  and  traffic;  these 
basic  characteristics  will  be  adapted  and  enhanced  as  a  result  of  the 
model  analysis  of  the  candidate  networks.  Data  input  included  here  is 
Network  Identification,  Node  Record  Indicator,  Number  of  Nodes,  Node  Number, 
Node Name, Node Location (Latitude, Longitude), Processor Type, Processor
Number, Link Record Indicator, Number of Links, Source Node Number, Destination
Node Number, and Line Number.
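
The configuration fields listed above suggest a simple record layout: one header per network, one record per node, and one record per link. The sketch below is one possible arrangement. The field types, the grouping into three record kinds, and the consistency check are assumptions; the report gives only the field names. The record indicators and the node and link counts are represented implicitly by the record classes and the list lengths.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeRecord:
    node_number: int
    node_name: str
    latitude: float
    longitude: float
    processor_type: str
    processor_number: int


@dataclass
class LinkRecord:
    source_node_number: int
    destination_node_number: int
    line_number: int


@dataclass
class NetworkConfiguration:
    network_identification: str
    nodes: List[NodeRecord] = field(default_factory=list)
    links: List[LinkRecord] = field(default_factory=list)

    @property
    def number_of_nodes(self) -> int:
        return len(self.nodes)

    @property
    def number_of_links(self) -> int:
        return len(self.links)

    def dangling_links(self) -> List[LinkRecord]:
        """Links whose source or destination is not a known node number."""
        known = {n.node_number for n in self.nodes}
        return [
            link for link in self.links
            if link.source_node_number not in known
            or link.destination_node_number not in known
        ]
```

A check such as `dangling_links` is useful when a configuration is edited on line, since adding or deleting nodes can leave link records that point at nodes no longer present.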

Eventually, with the general model, the data loaded will be either specified
directly  to  the  model  as  job  and  traffic  matrices  or  loaded  into  the  data 
base  as  a  set  of  specifications  that  can  be  selected  and  applied  to  several 
network  configurations.  The  first  option  is  available  to  the  user  through 
the  prototype  model. 

Finding  a  candidate  DoD  network  was  more  difficult  than  first  anticipated. 
Logistics  and  command  and  control  networks  were  considered,  but  posed  the 
drawbacks  of  being  either  classified  or  in  procurement.  After  much  effort, 
a  candidate  was  found:  The  Marine  Corps  Personnel  Network,  which  comprises 
major computers in Kansas City and Washington, D.C., and satellite computers
in  Vietnam,  Hawaii,  Camp  Pendleton,  Camp  Lejeune,  and  Okinawa.  It  is  dynamic, 
in  part  because  of  the  scaling  down  of  the  Marine  Corps  effort  in  the  Far 
East,  and  analysis  over  time  is  therefore  possible.  The  satellite  computers 
process  command  and  control  information  as  well  as  personnel  data.  All 
computers  are  linked  through  AUTODIN.  Initial  statistical  data  on  messages, 
hardware,  and  AUTODIN  have  been  collected  from  the  Marine  Corps,  and  the  model 
network is now in the process of being run.

Although  not  directly  part  of  the  data  base  effort,  the  analysis  of  future 
technology  will  relate  to  it  and  to  the  modeling  of  future  configurations 
and  properties  of  networks.  A  series  of  white  papers  is  being  produced 
describing  various  developments.  One,  dealing  with  memory  organizations 
and  addressing,  will  soon  be  published;  others  are  being  written  on 
communications,  terminals,  peripherals,  and  other  areas  related  to  networks. 
These  will  be  published  during  the  next  reporting  period. 

2.2.2  Plans 

In examining the future of the data base development, the model development
must  be  kept  in  view.  In  order  to  model  complex,  large-scale  networks,  we 
will  need  a  data  base  that  can  be  accessed  either  in  batch  or  on  line  by 
the  model  executive.  The  data  base  currently  envisioned  will  contain  data 
from  Auerbach’s  Computer  Characteristics  Digest  and  Data  Communication 
Reports,  as  well  as  data  drawn  from  the  ARPA  Network,  AUTODIN,  and  the  Marine 
Corps  network.  Ultimately,  it  would  be  desirable  to  allow  the  user  to  specify 
the  computer,  peripherals,  and  terminals  for  each  node  by  entering  a  simple 
code command. This would not only facilitate the initial modeling, but also be
valuable for on-line modification of the network.


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The  following  milestones  are  expected  to  be  achieved  during  the  next  six 
months:

1.  Modeling  of  the  Marine  Corps  Personnel  Network. 

2.  Completion  of  tradeoff  analysis. 

3.  Completion  of  technology  forecasts. 

4.  Initiation  of  general  data  base  and  model  development. 

Beyond  these  milestones  would  be  the  modeling  of  larger  networks,  analysis  of 
alternative  futures,  the  possible  incorporation  of  statistical  and  simulation 
tools  for  the  analysis  of  the  data  base,  and  interactive  data  base  (and  model) 
access  through  the  ARPA  Network. 

2.3  STAFF 

Dr. B. ?. Lientz, Principal Investigator

Model  Development  Project 

Dr. R. L. Citrenbaum, Head
G. M. Cady
D. R. Lashier

Data  Base  Development  Project 

Dr. N. E. Willmorth, Head
L. G. Chesler
D. M. Gunn (part time)
R. Mosier (part time)

2.4  DOCUMENTATION 

Citrenbaum, Ronald L. Analysis of Computation and Communication Model.
SDC document SP-3601. In publication.

Gunn,  Donald  M.  Survey  of  Digital  Data  Communications.  SDC  document  SP-3603. 
In  publication. 

Mosier,  Robert.  Memories:  Addressing  and  Organization.  SDC  document  SP-3602. 
In publication.
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3. NATURAL COMPUTER INPUT/OUTPUT


As the power of both computer hardware and computer software increases and
the price decreases, the computer becomes ever more present as both a tool
and a necessity in our daily lives. As the circle of contact between men and
computers continues to expand, the disparity between man and computer in
communications capability becomes more evident and less tolerable. Therefore,
it behooves us to divert some of the available power of computer systems to
bridge the communication gap and provide computer input and output systems
that are "natural" and acceptable to men. Toward this end, the Natural
Computer Input/Output task is providing research and development in computer
processing and semantic interpretation of natural English, hand-drawn
pictorial and symbolic input, computer-generated images, speech understanding
by the computer, and computer-synthesized speech.

The  Natural  Computer  Input/Output  work  is  divided  into  three  projects:  the 
CONVERSE project (the development of an English data management system), the
Graphic  Input/Output  project  (recognition  and  utilization  of  hand-drawn  and 
hand-printed  input),  and  the  Voice  Input/Output  project  (speech  understanding 
and  synthesis).  The  interdependence  between  the  Voice  Input/Output  and 
CONVERSE  projects  is  both  obvious  and  natural,  and  (though  the  two  projects 
are  at  different  levels  of  attainment)  communication,  cooperation,  and 
commonality  of  basic  intent  are  uppermost  in  their  direction.  The  Graphic 
Input/Output  project's  primary  concerns  are  with  information  whose  content 
cannot  be  readily  conveyed  by  either  the  spoken  or  the  written  word,  but 
rather through pictorial or notational conventions best portrayed and
conveyed in two dimensions. The major emphasis of this work has been on
hand-printed input and computer-generated output of mathematics, developing
applications  requiring  mathematical  notation,  and  extending  the  notational 
capability  into  other  domains. 

Taken as a whole, the Natural Computer Input/Output task is providing
the  technological  basis  for  operational  man-machine  systems  for  which  the 
ultimate  end  user  will  require  little,  if  any,  special  training  in  computer 
science. 


3.1  CONVERSE:  AN  ENGLISH  DATA  MANAGEMENT  SYSTEM 

The principal goal of this project is to develop promising new
natural-language processing techniques and to implement an experimental
prototype computer program system, based on those techniques, that will
permit users to communicate with large on-line data bases in ordinary English.

The  two  key  objectives  of  the  prototype  system  are  (1)  the  ability  to  recognize 
a  substantial  subset  of  English  and  (2)  the  ability  to  store  and  search 
large  quantities  of  conceptual  and  factual  information.  Particular  emphasis 
is  being  placed  on  versatility — on  developing  a  system  that  is  potentially 
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applicable to the management of a wide variety of files, including large
formatted  files,  files  of  complex  relational  data,  and  files  of  documentary 
information.  A  third  objective  is  the  development  and  demonstration  of 
language  processing  and  data  retrieval  techniques  that  are  more  powerful 
than previously available techniques, or more efficient, or both.

At the start of this contract year, we had (1) implemented first versions of
a natural-language compiler that recognized English surface syntactic
structures and produced file-searching procedures in a formal intermediate
language; (2) constructed a data management system that accepted these
procedures and carried out specified storage, search, and computational
operations; (3) developed a promising new approach to syntactic recognition
in which all appropriate deep and surface structures are simultaneously
produced (deep-structure representations generalize the syntax-analysis
component and provide canonical trees upon which a simplified set of semantic
rules can operate to provide intermediate-language procedures); (4) begun the
programming necessary to extend the natural-language compiler to produce deep
structures; (5) begun exercising the data management system with an initial
data base of 4,000 facts; (6) implemented functions to create an initial
concept network of semantic information used for disambiguation during
input-sentence parsing and intermediate-language generation; and (7) made a
start, in collaboration with Dr. L. Travis at the University of Wisconsin, on
a promising new approach to large-scale data base inference-making.

The single impediment to further progress at the start of this reporting period
was the severe restriction of available core-memory space in the LISP
programming system used by CONVERSE on the IBM/360, despite our efforts to
place more and more information on disc in an efficiently retrievable form.
(At the present time all dictionary, concept-net, and fact-file data, and
most grammar-rule information, are stored on disc.) Progress on both the
CONVERSE system and the LISP system has been realized. We report on CONVERSE
here and on LISP in section 5.3.

3.1.1  Progress 

The  current  status  of  CONVERSE-360  is  described  in  a  paper  presented  at  the 
April,  1971,  ACM  Symposium  on  Information  Storage  and  Retrieval.  This  paper 
has  been  published  in  the  symposium  proceedings. 

During this past quarter, two important milestones were achieved:
(1) the demonstration of a limited-core-memory version of CONVERSE (version
V-0)  and  (2)  the  realization  of  a  new  natural-language  compiler  in  the 
expanded  version  of  the  LISP  1.5  system.  Other  key  areas  in  which  progress 
has  been  made  include  data  management  and  intermediate  language,  deductive 
inference,  and  the  analysis  of  fundamental  syntactic  and  semantic  relations. 
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CONVERSE  V-0 


Despite  severe  core-memory  restrictions,  an  initial  demonstrable  system  was 
achieved  at  the  midpoint  of  this  reporting  period.  The  system  is  implemented 
in  two  46-page  copies  of  LISP  1.5  running  under  ADEPT/67.  The  system 
translates  questions  of  moderate  complexity  into  intermediate  language  using 
only  surface-structure  information  and  data  from  a  small  concept  network. 
Questions are then answered by searching our initial data base. A sample
V-0  printout  is  shown  in  Figure  3-1. 

In  pushing  this  demonstration  milestone  to  a  successful  completion,  we  have 
solved  a  number  of  minor  but  time-consuming  problems  in  intercommunication 
between  the  two  LISP  programs  and  between  the  programs  and  their  associated 
disc  data  files.  All  of  the  checked-out  facilities  in  V-0  will  be  of  direct 
use in future, larger-scale versions of CONVERSE (versions V-1/V-2).

85-Page  LISP  Natural-Language  Compiler 

The  second  major  milestone  was  achieved  late  in  the  reporting  period  when 
staff  members  completed  an  extensive  series  of  revisions  and  improvements 
to  our  LISP  programming  system.  The  most  important  result  of  this  effort 
was  the  ability  to  construct  85-page  LISP  programs.  (This  work  is  reported 
separately  in  section  5.3.) 

As soon as the new large LISP became available, we implemented a new
natural-language compiler with facilities for producing both surface and deep
structures. As the reporting period came to a close, we were producing a number
of  correct  surface  and  deep  structures  and  were  well  along  in  the  process 
of  writing  and  implementing  rules  to  produce  intermediate  language  from  deep 
structures. 

In syntax, progress has been made in both extending the scope of the parser's
input and improving the quality of the parser's output. The range of English
structures that can be given a surface-structure parsing has increased to
include nominalizations and sentential complementation (of the FOR-TO,
THAT, and POSS-ING varieties), superlatives, tag questions, and verb-particle
constructions. By revising the Structure Building (SB) rules, we have
continued to eliminate undesired surface parsings for any particular string.

Revising  the  SB  rules  has  also  improved  the  quality  of  the  deep  structures 
output.  This  end  has  also  been  achieved  by  (1)  debugging  the  operation  of 
the  Structure  Changing  (SC)  rules  (which  perform  such  basic  operations  as 
adjunction,  deletion,  and  replacement  on  trees)  and  (2)  adding  new  SC  rules 
that  reconstruct  elements  deleted  in  the  surface  string  from  propositions 
semantically inherent in that string. The latter effort naturally results
in greater success in representing paraphrase relations. Special efforts
have  been  made  to  represent  the  wide  variety  of  English  comparative  structures 
in  such  a  way  that  they  are  amenable  to  a  relatively  simple  semantic  treatment. 


WHAT IS THE POPULATION OF GARDEN GROVE?
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Figure  3-1.  Question  Answering  in  CONVERSE  V-0 


WHICH  EMPLOYEES  SCHEDULED  TO  BE  RELOCATED  LAST  YEAR  ARE  STILL  EMPLOYED? 
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Figure  3-2  illustrates  at  least  two  points  about  deep  structure:  First,  the 
deep  structure  contains  in  explicit  form  information  that  is  only  implied  in 
the  surface  structure;  second,  the  deep  structure  has  a  canonical  form  for 
each  proposition. 

To achieve an interface between syntax and semantics, i.e., intermediate
language, a set of Semantic Interpretation (SI) rules was developed. These
rules  take  as  input  deep-structure  trees  and  produce  as  output  procedures 
in  intermediate  language.  The  SI  rules  are  treated  as  a  separate  component 
to  facilitate  revision  and  debugging  of  the  syntax  and  SI  rules  and  to 
accelerate  run  time  by  avoiding  unnecessary  semantic  work  on  syntactic 
structures  that  are  aborted  in  the  parsing  process. 

In  the  future,  the  semantic  rules  will  incorporate  case  assignment.  The 
case  framework  employed  by  CONVERSE  provides  a  strategy  for  mapping  each 
distinct  function  of  a  prepositional  phrase  and  noun  phrase  into  the  relevant 
semantic  categories  in  the  data  base.  Restricting  possible  case  relations 
to  those  that  might  be  encountered  in  a  practical  data  base  has  greatly 
simplified  the  choice  of  possible  cases. 

The  preliminary  task  of  extracting  a  relatively  comprehensive  set  of  distinct 
classes  of  prepositional  phrases  is  now  completed.  These  classes  form  the 
basis  of  a  first  list  of  cases.  Effort  in  this  direction  is  continuing  in 
two  steps:  (1)  examination  of  the  syntactic  and  semantic  information 
associated  with  each  prepositional  phrase  for  clues  concerning  its  case 
membership; and (2) the utilization of this information to write
case-assignment rules.

Intermediate Language and Data Management

We  are  presently  working  out  a  number  of  operators  to  be  added  to  the  formal 
intermediate language (IL). These changes will make IL easier to generate
from  deep  structures  and  will  increase  its  expressive  power.  For  example, 
we  have  developed  operators  to  handle  quantification  (e.g.,  "Every  girl  was 
kissed  by  some  boy."),  compound  negation  (e.g.,  "No  Pole  knows  a  German 
whom he does not like."), and complex comparisons among relations (e.g.,
"Which flights depart from midwestern cities for cities farther east?").

A  new,  85-page  version  of  the  CONVERSE  data  management  system  has  been 
produced, and a second data base of more than 10,000 facts has been
constructed for more extensive exercising of future versions of CONVERSE.
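
As an illustration of what the quantification and compound-negation operators
mentioned earlier in this subsection must express, the two example sentences
can be given conventional first-order readings. The notation below is standard
predicate logic, not the CONVERSE intermediate language (whose operator syntax
is not shown in this report), and the predicate names are assumptions chosen
for readability. The first sentence is scope-ambiguous; only the reading in
which each girl may have been kissed by a different boy is shown.

```latex
% "Every girl was kissed by some boy." (universal quantifier given wide scope)
\forall x \,\bigl(\mathrm{girl}(x) \rightarrow
    \exists y \,(\mathrm{boy}(y) \land \mathrm{kissed}(y, x))\bigr)

% "No Pole knows a German whom he does not like."
\neg \exists x \,\exists y \,\bigl(\mathrm{Pole}(x) \land \mathrm{German}(y)
    \land \mathrm{knows}(x, y) \land \neg\,\mathrm{likes}(x, y)\bigr)
```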

Inference-making 

Work  on  an  inferential  system  for  CONVERSE  during  this  reporting  period  has 
focused  on  developing  specifications  for  the  deduction  grapher,  which  is  the 
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part  of  the  system  that  draws  inferences  from  general  facts.  Programming  of 
one  part  of  the  deduction  grapher  has  commenced,  and  the  design  of  a  driver 
language  is  underway.  This  driver  language  is  the  CONVERSE  intermediate 
language extended to enable its use for specifying deduction-grapher
strategies and operand expressions.

The  deduction  grapher  is  designed  to  make  inferences  in  a  question-answering 
system  rather  than  in  a  formal  mathematical  system.  The  difference  is  that, 
in  a  question-answering  system,  an  essential  (and  perhaps  the  most  difficult) 
part  of  successful  inference  is  the  selection  of  relevant  premises  from  a 
very  large  set  of  premises  (most  of  which  are  irrelevant  to  the  inference 
being attempted), while in a formal mathematical system, premise selection
does not present a problem. On the other hand, showing in a
question-answering system that selected premises have needed deductive
relationships to each other does not present the problem it does in
mathematical inference
because  the  relationships  are  likely  to  be  simple  ones  among  many  different 
predicates. 

The  deduction  grapher  proceeds  by  generating  proof  proposals,  progressively 
filling  out  the  details  of  these  proposals,  and  filtering  out  bad  proposals 
at  various  stages  along  the  way.  Bad  proposals  are  those  that  cannot  be 
filled  out  to  become  valid  proofs.  If  all  originally  generated  proposals 
are  filtered  out,  the  system  loops  back  to  generate  additional  ones. 

General facts are stored in an associative network called the premise graph.
This graph is composed of formalized assertions explicitly linked together
at the points where they can possibly interact with each other deductively.
As with mechanical theorem-provers based on the resolution principle, the
assertions are in Skolemized, quantifier-free form, but the deduction grapher
uses primitive conditionals as its normal form rather than the conjunctive
normal form of resolution systems. The explicit, premise-connecting links
in the premise graph represent Prawitz-Robinson unifications of literals,
i.e., of the atomic components of premises. In contrast to previous automatic
theorem provers, in the deduction grapher first-order unifications (i.e.,
unifications between literals in "original clauses") are discovered when a
general fact (a premise) is entered into the system, and they do not need
to be recomputed at each inference attempt. Rather, they become a permanent,
defining part of the premise graph, making it possible for the deduction
grapher to discover very quickly premises appropriately interconnected for
its deduction tasks.
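
The idea of computing unifications once, at premise-entry time, can be
sketched as follows. This is a minimal illustration in Python, not the
deduction grapher itself: the class name PremiseGraph, the tuple encoding of
literals, the "?" prefix for variables, and the sample premises are all
assumptions. Literals are flat (no function terms, so no occurs check is
needed), and variables are assumed to be already distinct across premises.

```python
def is_var(term):
    """Variables are strings beginning with '?' (an assumed convention)."""
    return isinstance(term, str) and term.startswith("?")

def unify(lit_a, lit_b):
    """Return a substitution unifying two flat literals, or None.
    A literal is a tuple: (predicate, arg1, arg2, ...)."""
    if lit_a[0] != lit_b[0] or len(lit_a) != len(lit_b):
        return None
    subst = {}

    def walk(term):
        # Follow bindings made so far.
        while is_var(term) and term in subst:
            term = subst[term]
        return term

    for x, y in zip(lit_a[1:], lit_b[1:]):
        x, y = walk(x), walk(y)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None  # two different constants cannot be unified
    return subst

class PremiseGraph:
    """Premises are primitive conditionals: antecedents -> consequent."""

    def __init__(self):
        self.premises = []  # (list of antecedent literals, consequent literal)
        self.links = []     # (from_premise, to_premise, antecedent_index, subst)

    def add(self, antecedents, consequent):
        """Store a premise and record, once, every unification linking it
        to the premises already in the graph."""
        new = len(self.premises)
        self.premises.append((antecedents, consequent))
        for old, (ants, cons) in enumerate(self.premises):
            # A stored consequent that can feed one of the new antecedents.
            for i, ant in enumerate(antecedents):
                s = unify(cons, ant)
                if s is not None:
                    self.links.append((old, new, i, s))
            # The new consequent feeding a stored antecedent.
            if old != new:
                for i, ant in enumerate(ants):
                    s = unify(consequent, ant)
                    if s is not None:
                        self.links.append((new, old, i, s))
        return new

g = PremiseGraph()
g.add([("employee", "?x")], ("person", "?x"))
g.add([("person", "?y")], ("mortal", "?y"))
print(g.links)
```

Because the links are stored with the graph, a later inference attempt only
has to follow them; no unification is repeated at question time.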

Since  the  premise  graph  can  potentially  get  very  large,  the  first  thing 
that  the  deduction  grapher  does  when  given  a  deduction  task  is  to  discover 
plausible  paths  to  middle  terms.  These  middle  terms  serve  to  identify  likely 
nodes  in  the  premise  graph  from  which  proof  proposals  can  be  developed. 

Several techniques for generating middle-term paths are being investigated.
The one that currently looks most promising is the iterative multiplication
of a predicate connection matrix by itself.
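
The matrix technique can be illustrated with a small sketch. It is not the
project's implementation; the predicate names, the Boolean formulation, and
the helper names (bool_matmul, reachable_within, middle_terms) are assumptions
made for illustration. Entry (i, j) of the connection matrix is true when some
premise links predicate i directly to predicate j. Squaring the matrix
(with the diagonal set) s times marks every pair of predicates joined by a
path of at most 2**s links, and a middle term is then any predicate reachable
from the starting predicate that also reaches the goal predicate.

```python
# Hypothetical predicates used only for this illustration.
PREDICATES = ["employee", "works_in", "located_in", "city"]

def bool_matmul(a, b):
    """Boolean matrix product: (a*b)[i][j] = OR over k of a[i][k] AND b[k][j]."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def reachable_within(c, squarings):
    """Square (I OR C) repeatedly; after s squarings the result marks
    every pair joined by a path of at most 2**s premise links."""
    n = len(c)
    m = [[c[i][j] or i == j for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        m = bool_matmul(m, m)
    return m

def middle_terms(c, src, dst, squarings):
    """Predicates reachable from src that in turn reach dst."""
    reach = reachable_within(c, squarings)
    return [PREDICATES[k] for k in range(len(c))
            if k not in (src, dst) and reach[src][k] and reach[k][dst]]

C = [
    [False, True,  False, False],  # employee   -> works_in
    [False, False, True,  False],  # works_in   -> located_in
    [False, False, False, True],   # located_in -> city
    [False, False, False, False],
]
print(middle_terms(C, 0, 3, squarings=2))
```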
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The subgraphs picked out of the premise graph as proof proposals represent
proofs that would be valid considering only truth-functional structure.
They serve as useful plans for working out complete proofs, i.e., proofs
that are also valid from the standpoint of quantificational structure.

The  approach  used  to  validate  proof  proposals  is  to  discover  possible 
collisions of substitutions for the variables contained in the premises
used in a proof proposal, and then to test whether the possible collisions
are  actual.  To  be  a  valid  proof,  a  proposal  must  be  collision  free. 
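
One plausible reading of the collision test is sketched below. The report
does not specify the procedure, so this interpretation is an assumption, as
are the function names and the "?" prefix marking variables: the
substitutions attached to the links of a proposal are merged one by one, and
an actual collision is reported when a single variable would have to stand
for two different constants.

```python
def resolve(term, bindings):
    """Follow variable bindings until an unbound term is reached."""
    while term in bindings:
        term = bindings[term]
    return term

def merge_substitutions(substitutions):
    """Merge the substitutions used in a proof proposal.
    Returns the merged bindings, or None if an actual collision is found."""
    merged = {}
    for subst in substitutions:
        for var, value in subst.items():
            a, b = resolve(var, merged), resolve(value, merged)
            if a == b:
                continue            # already consistent
            if a.startswith("?"):
                merged[a] = b       # bind the still-free variable
            elif b.startswith("?"):
                merged[b] = a
            else:
                return None         # two distinct constants: collision
    return merged

# Collision free: ?x and ?y can both stand for the constant "a".
print(merge_substitutions([{"?x": "?y"}, {"?y": "a"}]))
# Actual collision: ?x (through ?y) would have to be both "a" and "b".
print(merge_substitutions([{"?x": "?y"}, {"?x": "a"}, {"?y": "b"}]))
```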

An important component of the deduction grapher is the truth evaluator (T-E)
described in the final report for the previous contract period.* Its
importance lies in the fact that most of the information used in
question-answering inference is concrete (logically of the form "P(a)" or
"R(a,b)" where "a" and "b" are not variables but proper names). There are so
many concrete facts that they cannot be efficiently stored in the premise
graph; instead, they are stored as data sets in the CONVERSE fact file.
The T-E component is called when needed by the deduction grapher to retrieve
information from the fact file.

Analysis of Fundamental Syntactic/Semantic Relations

Two approaches have been followed in developing an affixal component for
CONVERSE. In the first approach, about 10,000 definitions of suffix-words
were extracted from the machine-readable transcript of Webster's Seventh New
Collegiate Dictionary (W7) and sorted according to the suffix and
part-of-speech shift by which they had been formed. One hundred and fifteen
definitions were obtained for adjectives formed by adding '-ful' to nouns,
and 900 definitions for nouns formed by adding '-ion' to verbs.
Alphabetization of the definitions within each group usually brings out quite
clearly the major semantic functions of each suffix and often suggests
semantic categories in terms of which selectional constraints on the suffix
can be stated. These constraints will enable CONVERSE to predict, in favorable
cases, which semantic function is appropriate for a given use of the suffix.
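
The sorting step described above can be sketched in a few lines. The record
layout and the four sample entries are invented for illustration; they are
not taken from the W7 transcript, and the function name is an assumption.
Definitions are grouped by suffix and part-of-speech shift and then
alphabetized within each group, so that recurring defining formulas fall next
to one another.

```python
from collections import defaultdict

# Hypothetical records: (word, base part of speech, derived part of speech,
# suffix, definition text). The definitions are made up for this sketch.
ENTRIES = [
    ("hopeful",  "n", "adj", "-ful", "full of hope"),
    ("careful",  "n", "adj", "-ful", "marked by care"),
    ("creation", "v", "n",   "-ion", "the act of creating"),
    ("action",   "v", "n",   "-ion", "the process of acting"),
]

def group_by_suffix_shift(entries):
    """Group definitions by (suffix, base POS, derived POS) and
    alphabetize them by definition text within each group."""
    groups = defaultdict(list)
    for word, base_pos, derived_pos, suffix, definition in entries:
        groups[(suffix, base_pos, derived_pos)].append((definition, word))
    return {key: sorted(values) for key, values in groups.items()}

for key, definitions in group_by_suffix_shift(ENTRIES).items():
    print(key, definitions)
```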

In the second approach, we have sought to select and specify notions suitable
for inclusion in CONVERSE's concept network so that the semantic functions
of each of the more frequently used suffixes can be applied by CONVERSE to
derive the meanings of words formed by every such suffix from the meanings
of their base forms. Keywords of the defining formulas used in W7
definitions of suffix-words, when taken in the senses in which they are used in

* Weissman, C. Computer-Aided Command: Final Semiannual Technical Summary
Report to the Director, Advanced Research Projects Agency, for the period
16 March 1970 to 13 September 1970. SDC document TM-3628. September 1970.
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those  formulas,  provide  an  initial  set  of  affixal  relations  suitable  for 
this  purpose.  Conceptual  analyses  previously  prepared  for  this  set  of 
relations  are  being  refined  in  the  light  of  comprehensive  data  now  being 
obtained  regarding  coordination  patterns  among  the  defining  formulas  in  W7. 

As soon as this refinement has been completed, we will prepare brief
conceptual analyses of the notions underlying the case relations now recognized
by the CONVERSE grammar. It is already apparent that there will be
considerable overlap not only between these two sets of relations but also
between each and a set of 60 notions involved in thematic relations for which
conceptual analyses have already been prepared. The notions falling within
the  intersection  of  these  three  sets  are  the  ones  that  we  will  enter  first 
in  CONVERSE's  concept  network.  We  anticipate  that  these  additions  to  the 
network  will  not  only  provide  CONVERSE  with  a  significant  capability  for 
interpreting  affixes,  syntactic  relations,  and  thematic  relations,  but 
also  facilitate  the  statement  of  general  facts  for  use  by  other  components 
of  CONVERSE. 

Cooperation  with  Other  Projects 

We are continuing to collaborate closely with the Voice Input/Output project
in the long-range objective of attaining a vocal CONVERSE. We believe that
present efforts at producing deep structures and the forthcoming capabilities
for inference making will be especially useful in reaching this long-range
objective.


3.1.2  Plans 

Plans for the next six months center on two demonstration milestones:
CONVERSE V-1 and V-2. V-1 will be demonstrated during the next quarter.
This version will demonstrate a considerably enhanced question-answering
capability, one that utilizes our new deep-structure parser and a richer
data base. The second milestone, CONVERSE V-2, to be reached in the last
quarter of the contract period, adds to V-1 a capability for recognizing
and interpreting declarative and imperative sentences.

We will demonstrate user feedback, and features that promote user confidence,
in V-1. The user-extensible features of CONVERSE will be demonstrated in
V-2. We will continue efforts toward the ultimate integration of the
deduction grapher program and files into the overall CONVERSE system.

Finally, we will begin to explore network applications of CONVERSE and the
question-answering systems of other ARPA nodes.
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3.2  GRAPHIC  INPUT/OUTPUT 

Enhancing  communication  between  man  and  computer  by  providing  functions  and 
facilities  that  are  compatible  with,  or  equivalent  to,  techniques  used  in 
visual  man-to-man  communication  is  the  principal  goal  of  the  Graphic  Input/ 
Output  project.  Our  concern  has  been  to  develop  methods  of  graphical  input 
and  output  that  will  permit  a  user  to  carry  on  a  dialogue  with  a  computer 
in  the  language  and  notation  of  his  discipline  or  problem  domain. 

The  functional  entities  required  for  such  a  dialogue  are  a  data-input 
tablet  (e.g.,  the  RAND  Tablet),  an  interactive  CRT  display  operable  as  a 
terminal  in  a  time-sharing  system,  and  a  character-recognition  program  that 
will  accept  the  symbols  used  in  the  notational  expressions.* 

The  near-term  project  goal  is  to  develop  programming  systems  that  utilize 
two-dimensional  notation.  Our  initial  effort  is  to  use  mathematics,  the 
most ubiquitous scientific notation, for numeric and symbolic programming.

The  work  is  based  upon  previously  developed  programs  that  accept  two- 
dimensional  mathematical  expressions  hand-drawn  on  the  input  tablet,  extract 
the  explicit  and  implicit  information  from  them,  and  transform  them  into 
representations  amenable  to  existing  processing  techniques. 

3.2.1  Progress 

The process of converting our existing AN/FSQ-32 Time-Sharing System programs
to the IBM/360 ADEPT Time-Sharing System has continued during this period.
All programs except the Unparser, the program that converts a linear
representation of a mathematical expression into a high-quality
two-dimensional representation, are now converted and operational.

In  the  process  of  conversion,  a  new  dictionary-building  program,  one  more 
efficient  and  convenient  than  any  of  its  predecessors,  was  implemented  for 
the  character-recognition  program. 

The  Adding  Machine  (TAM) 

An initial language using mathematical notation, called TAM (The Adding
Machine), has been designed and is being implemented. TAM allows arithmetic
manipulation using a powerful set of operators on constants, variables,
and one-dimensional or two-dimensional arrays. It provides looping
facilities, single-statement functions, and user-defined input and output.
Some built-in functions, such as square root and logarithm, and built-in
constants, such as π and e, are provided. TAM is an incremental system: each
statement is executed before the next statement is requested.
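
The incremental style of operation can be illustrated with a short sketch.
It is not TAM: the statement syntax here is ordinary linear Python-style
arithmetic rather than two-dimensional notation, and the Session class, its
method names, and the function table are assumptions. What it does show is
the property described above: each statement is evaluated as soon as it is
entered, and its effect on the variables is visible to the next statement.

```python
import ast
import math
import operator

# Operators and built-ins assumed for this sketch.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}
FUNCS = {"sqrt": math.sqrt, "log": math.log}

class Session:
    """Executes one statement at a time, keeping variables between statements."""

    def __init__(self):
        self.env = {"pi": math.pi, "e": math.e}  # built-in constants

    def _eval(self, node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return self.env[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](self._eval(node.left), self._eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](self._eval(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCS):
            return FUNCS[node.func.id](*[self._eval(arg) for arg in node.args])
        raise ValueError("unsupported construct")

    def execute(self, statement):
        """Run a single statement immediately and return its value."""
        stmt = ast.parse(statement.strip(), mode="exec").body[0]
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            value = self._eval(stmt.value)
            self.env[stmt.targets[0].id] = value
            return value
        if isinstance(stmt, ast.Expr):
            return self._eval(stmt.value)
        raise ValueError("unsupported statement")

session = Session()
for line in ["r = 3", "area = pi * r ** 2", "sqrt(area / pi)"]:
    print(line, "->", session.execute(line))
```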


* Bernstein, M. I. Hand-Printed Input for On-Line Systems. SDC document
TM-3937. April 1968.
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TAM  is  being  implemented  with  SDC’s  compiler-writing  systems.  The  interpreter 
has  been  designed  and  implemented,  and  is  being  debugged  with  dummy  inputs 
from a tape file. The interconnection routines between the parser (which
reduces two-dimensional mathematical notation to one-dimensional strings) and
the interpreter have been written and are being debugged.

Dictionary  Building 

To build a dictionary, the user must begin by supplying samples of his
printing. In Figure 3-3(a) the program has requested samples and the user has
responded. In Figure 3-3(b) the input characters have been supplied to the
recognizer and have not been recognized, as indicated by the '?'. There is
one '?' for each unrecognized input stroke, aligned with the stroke, so that
the user can easily determine which characters were not recognized. The
user may now select the alphabet, appearing at the top of the screen, used
for defining the input characters. The choices, selected by the light button
at the lower right corner, are S, special characters (mostly punctuation
and mathematical symbols); G, Greek letters; N, numbers (the alphabet shown);
U, upper case; and L, lower case.

The user defines a character by encircling it and touching the appropriate
character in the alphabet at the top of the screen, as shown in Figure 3-3(c).
The system responds by entering the definition in the dictionary and then
again applying the character recognizer to the input string, with the result
shown in Figure 3-3(d). More than one character can be defined at a time,
as shown in Figures 3-3(e) and (f). Figures 3-3(g) and (h) show the
definition of an alphabetic character.
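
The define-and-recognize cycle just described can be sketched as follows.
This is a deliberately simplified illustration, not the SDC recognizer: each
input character is reduced to a single hypothetical feature tuple that must
match a dictionary entry exactly, and the class and method names are
assumptions. The point shown is only the interaction: unrecognized input is
marked '?', the user supplies a definition, and the recognizer is applied
again to the same input.

```python
class CharacterDictionary:
    """Maps (hypothetical) stroke-feature tuples to characters."""

    def __init__(self):
        self.entries = {}

    def recognize(self, strokes):
        """Return one character per input stroke, '?' where unrecognized."""
        return [self.entries.get(features, "?") for features in strokes]

    def define(self, strokes, indices, character):
        """Enter a definition for the encircled strokes (given by index),
        then apply the recognizer to the whole input again."""
        for i in indices:
            self.entries[strokes[i]] = character
        return self.recognize(strokes)

# Three hand-printed samples; the first and third share the same features.
samples = [("loop", "closed"), ("bar", "vertical"), ("loop", "closed")]
dictionary = CharacterDictionary()
print(dictionary.recognize(samples))         # nothing is known yet
print(dictionary.define(samples, [0], "0"))  # user defines the first sample
print(dictionary.define(samples, [1], "1"))  # user defines the second sample
```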

The dictionary builder also contains a TEST mode that allows the user to
test the dictionary he has built. The user provides hand-printed information,
as shown in Figure 3-4(a). The input characters that are recognized are
replaced by generated characters of the same size and position, as shown in
Figure 3-4(b). At any time, the user may return to the dictionary-building
mode and use the current input to add to the dictionary, as shown in
Figures 3-4(c), (d), and (e). He may then return to the TEST mode,
as shown in Figure 3-4(f).

Implementation  Activities 

Work  has  continued  on  implementation  of  the  character  recognizer  in  the 
Honeywell  DDP-516.  Currently,  the  feature-extraction  portion  of  the 
recognizer  is  operational;  other  portions,  including  dictionary  lookup  and 
a  rudimentary  dictionary-building  routine,  are  coded  but  not  tested. 

We  have  designed  a  multiplexor  to  allow  an  additional  data  tablet,  a  Graf- 
Pen,  to  be  connected  to  the  DDP-516.  Used  with  an  ARDS  display  terminal, 
it  will  provide  another  graphic  I/O  console.  We  have  written  software  to 
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Figure  3-3.  Dictionary  Building 
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Figure  3-4.  TEST  Mode 
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convert  a  display  buffer  intended  for  the  Beta  Instruments  display  for 
use  with  the  ARDS  terminal.  This  allows  console  interchangeability. 

3.2.2  Plans 

We  expect  to  complete  a  demonstrable  version  of  TAM  by  the  end  of  the 
contract  period.  This  version  will  include  the  capabilities  of  the  Unparser 
to  display  user-defined  functions.  As  TAM  is  completed,  we  will  begin  work 
on  a  symbolic  processing  system  for  mathematics,  providing  operations  such 
as  algebraic  manipulation  or  symbolic  differentiation. 

We  expect  to  finish  our  implementation  of  a  character-recognition  program  in 
the  DDP-516  minicomputer.  This  should  provide  faster  response  for  the  user 
by  lightening  the  processing  load  in  the  IBM  360  and  will  further  isolate  us 
from  changes  in  the  parent  computer  hardware.  We  also  plan  to  implement, 
within  the  character  recognizer,  recognition  of  block-diagram  symbols — 
blocks, circles, and lines. This will enable us to expand our use of
two-dimensional notation to flow charts and block diagrams.

We  have  been  studying  means  for  making  our  graphics  programs — in  particular, 
the  character  recognizer — available  on  the  ARPA  Network.  We  will  participate 
in  the  development  of  network  graphic  protocols  to  ensure  that  data-tablet 
and  other  information  can  be  accommodated. 

3.3  VOICE  INPUT/OUTPUT 

The long-term goal of the Voice Input/Output project is the operation of
SDC’s  CONVERSE  system  with  vocal  speech  input  and  output.  CONVERSE  (see 
section  3.1)  is  a  natural-language  question-answering  system  that  employs 
a  user-extensible  subset  of  English  and  that  aims,  eventually,  at  employing 
a  virtually  unconstrained  subset  of  English  for  data  base  management. 

In  order  to  achieve  a  vocal  CONVERSE,  we  will  have  to  attack  and  solve 
almost  all  of  the  outstanding  research  problems  regarding  the  recognition 
of speech by a computer, including the recognition of continuous speech
(as  opposed  to  the  recognition  of  single  words),  the  ability  to  handle 
large  vocabularies,  and  the  development  of  scanning  processes  and  system 
integration  techniques  that  allow  parsing  and  disambiguation  of  noisy  input. 
Because  of  its  difficulties,  the  task  of  solving  these  problems  must  be 
approached  in  increments.  The  first  two  increments,  a  reimplementation  of 
the  Vicens-Reddy  speech  recognition  system  from  the  Stanford  University 
PDP-10  to  the  SDC  IBM  360/67  and  the  implementation  of  a  vocal  data  manage¬ 
ment  system  that  employs  a  highly  constrained  subset  of  spoken  English, 
are  the  immediate  goals  toward  which  the  Voice  I/O  project  is  working  during 
the  current  contract  year. 

3.3.1 Progress

During  the  past  six  months,  project  efforts  have  focused  on  three  major 
objectives: 

1.  The  implementation  of  a  Voice  Laboratory,  including  the  competitive 
procurement  of  a  computer  to  support  the  project's  activities. 

2.  Complete reimplementation of the Vicens-Reddy speech recognition
system from the PDP-10 to the IBM 360/67, and the development of
software necessary to convert the system as implemented on the
IBM 360 for operation on the project's own computer.

3.  The  development  of  procedures  for  acoustic  processing  and  system 
integration. 

Progress  toward  these  objectives  has  been  substantial.  A  competitive 
procurement  held  among  nine  computer  manufacturers  resulted  in  the  selection 
of a Raytheon 704 minicomputer. The physical facilities for a Voice
Laboratory are now available to house the computer and a sound booth. The
reimplementation  of  the  Vicens-Reddy  speech  recognition  system  on  the 
IBM  360/67  has  been  completed,  and  additional  speech-analysis  subsystems 
have  been  programmed  and  added  to  the  Vicens-Reddy  system  to  expand  its 
capabilities.  Several  systems  procedures  for  processing  continuous  speech 
have  been  developed  and  are  being  documented. 

Voice  Laboratory 

The  Voice  Laboratory  (see  Figures  3-5  and  3-6)  is  now  complete,  and,  with 
the  delivery  of  the  Raytheon  704  computer  on  1  February  1971,  work  is 
progressing  on  a  completely  integrated  speech-data-processing  facility. 

The  Raytheon  704  combines  excellent  signal  processing  capabilities  with 
better  than  normal  software  for  a  minicomputer. 

As has been noted in previous reports, the Voice Laboratory uses highly
controlled acoustic and audio systems (now in conjunction with the Raytheon
704). The controlled audio system will assist in speech-data input,
output, preprocessing, and analysis, and will provide on-line and magnetic-tape
interfaces to other computers and, possibly, the ARPA Network. A key
physical  facility  within  the  laboratory  is  a  small,  broadcast-quality  sound 
booth  designed  to  allow  interactive  use  of  a  quiet  display  terminal  and  a 
microphone.  Speech  is  collected  by  a  high-quality  condenser  microphone 
and  amplified  by  broadcast  audio  equipment.  A  patch  panel,  a  tape  recorder, 
and  monitoring  equipment  allow  a  wide  variety  of  speech-data  collection, 
interactive  speech  experimentation,  and  voice-output  activities  to  take 
place  within  one  facility. 


Figure 3-5. [Caption illegible in source; legible diagram label: Analog-to-Digital Conversion System]

Figure  3-6.  Input  Hardware  for  the  Voice  I/O  Laboratory 

Software Development

During the past six months, the reimplementation of the Vicens-Reddy speech
recognition system on the IBM 360/67 has progressed to the point at which
coding is complete and checkout is in progress. Four documents (Kameny
and Ritea, TM-4652 series; see section 3.5) describe the preprocessing,
segmentation, and recognition subsystems, and explain the heuristics and
algorithms that have been used but not heretofore documented. The
lexicon-building and retrieval package is coded and operating but not yet documented.
A math library consisting of several versions of Discrete Fourier Transforms,
Fast Fourier Transforms, Chirp-Z Transforms, and digital filters is now
being tested for execution time, core requirements, and accuracy, and will
be documented when testing is complete. The major remaining task is the
conversion of the system from the IBM 360 to the Raytheon 704.
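
The kind of accuracy test mentioned above can be illustrated with a small sketch that compares a direct Discrete Fourier Transform against a radix-2 Fast Fourier Transform on the same samples. It is written in Python for readability and is not the SDC library; the function names and the test signal are assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Direct O(n^2) Discrete Fourier Transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

def max_error(a, b):
    """Largest absolute difference between two spectra."""
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical test signal: three cycles of a cosine over 16 samples.
samples = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
error = max_error(dft(samples), fft(samples))
```

Timing each routine over the same inputs would extend the comparison to execution time in the same way.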

Several  software  packages  are  being  developed  for  the  Raytheon  704.  A 
mini-LISP  has  been  coded  but  is  not  yet  checked  out.  Ultimately,  it  will 
allow  several  programs  that  are  now  operating  on  the  IBM  360  to  run  on 
the  704  in  combination  with  other  analysis  tasks.  Data  structures  in  the 
mini-LISP  are  character  and  variable-length  identifiers,  nodes,  and 
small  integers.  (If  necessary,  floating-point  numbers  will  be  added  at  a 
later  date.)  We  are  also  developing  a  graphics  package,  a  line-printer 
simulator  using  the  TEKTRONIX  storage  tube  display  and  hard-copy  unit, 
absolute  loaders  for  disc  and  magnetic  tape,  various  utility  programs,  and 
a  processor  that  simulates  parallel,  breadth-first  evaluation.  The  latter 
will  free  us  from  having  to  predetermine  the  order  of  evaluation  of  pattern 
rules,  which,  because  they  are  not  unique,  may  produce  several  different 
"correct"  identifications  of  the  same  acoustic  segment.  Predetermining 
the  order  of  their  evaluation  would  necessitate  eliminating  some  that 
might  either  be  correct  or  provide  useful  additional  information  in  cases 
of  ambiguous  identification  (such  as  may  occur  with  the  consonants  "b" 
and  "t"  in  different  words). 
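
The difference between a predetermined rule order and breadth-first evaluation can be sketched as follows. This is a minimal Python illustration, not the Raytheon 704 processor; the rule set, the feature names, and the segment values are all invented for the example.

```python
def evaluate_first_match(rules, segment):
    """Predetermined order: stop at the first rule that fires."""
    for label, predicate in rules:
        if predicate(segment):
            return [label]
    return []

def evaluate_breadth_first(rules, segment):
    """Visit every rule at the same level and keep all that fire.

    No rule is given priority, so competing identifications survive
    for later stages to disambiguate.
    """
    return [label for label, predicate in rules if predicate(segment)]

# Hypothetical, non-unique pattern rules over made-up acoustic features.
RULES = [
    ("b", lambda s: s["burst"] and s["voiced"]),
    ("t", lambda s: s["burst"] and s["energy"] < 0.5),
    ("s", lambda s: s["fricative"]),
]

segment = {"burst": True, "voiced": True, "energy": 0.3, "fricative": False}
first_only = evaluate_first_match(RULES, segment)
candidates = evaluate_breadth_first(RULES, segment)
```

With these assumed rules, the ordered evaluation reports only "b", while the breadth-first evaluation keeps both "b" and "t" as candidate identifications.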

System  Procedures 

In order to attack the problems of acoustic processing and system integration,
two parallel efforts are in progress. The first is the development of CWIPER,
a subsystem to perform all acoustic-oriented chores. These include collecting
the speech sample, performing initial segmentations, extracting features,
and  searching  for  items  in  the  utterance.  Parts  of  CWIPER  are  modeled  on 
the  Vicens-Reddy  system;  others  include  additional  hardware  feature- 
extraction  procedures  and  a  new  mapping  and  recognition  procedure  that  does 
not  require  the  input  utterances  to  be  completely  mapped  into  the  symbolic 
pattern  of  the  lexicon  items. 

The second effort involves the specification and development of the model
interface sequencing procedures to construct the total system. The mini-LISP
system is being built with modules that represent all parts of the system
except those that reside in CWIPER. These modules will be simulated with
symbolic,  rather  than  acoustic,  inputs.  This  approach  is  being  taken  to 
allow  the  testing  of  algorithms  on  data  of  known  accuracy.  Hopefully,  the 
tested  algorithms  will  be  combined  with  the  actual  acoustic  processors  to 
make  an  initial  version  of  the  vocal  data  management  system  operational 
during  the  next  six  months. 

3.3.2  Plans 


During the next six months, work will continue in several areas. The
Vicens-Reddy system will be checked out on the IBM 360 and converted to the
Raytheon 704. Several modifications are scheduled to improve the segmentation
and mapping processors; they include improved classification of fricatives
using the true-RMS detectors and better handling and identification of
transient segments. The mini-LISP system for the Raytheon 704 will be
completed and integrated into the processor library. Hardware development
of the acoustic and digital conversion subsystems will be completed. The
sound booth will be tuned to achieve the optimal, flat response for which
it  was  designed.  MUSIC  V,  a  FORTRAN  program  written  by  Max  Mathews  at 
Bell  Labs,  will  be  transferred  to  the  Raytheon  704  to  allow  us  to  test  our 
hypotheses  about  certain  acoustic  rules. 

Completion of CWIPER and the interface model will make possible the construction
of the continuous speech recognizer. The prototype vocal data management
system will demonstrate the reliability and appropriateness of the
techniques  that  have  been  developed. 

3.4  STAFF 

Natural  Computer  Input/Output  Staff 
M.  I.  Bernstein,  Manager 
CONVERSE  Project 

C.  H.  Kellogg,  Principal  Investigator 

J.  H.  Burger 
T.  C.  Diller 

K.  J.  Fogt 

L. C. Travis  } Consultants (part time)

Graphic Input/Output Project

T.  G.  Williams,  Principal  Investigator 
Joan Bebb (part time)

Jean  Igawa 
J.  P.  McGahey 
Jean  Saylor 


Voice  Input/Output  Project 

J.  A.  Barnett,  Principal  Investigator 
C.  R.  Kalinowski 
Iris  Kameny  (part  time) 

L.  M.  Molho 
H.  B.  Ritea 

R.  DeCrescent  (part  time) 

3.5  DOCUMENTATION 

Kameny, Iris, and H. Barry Ritea. Analysis and Development of the Vicens-Reddy
Speech Recognition System: Table of Contents for Document Series
TM-4652. SDC document TM-4652/000/00. December 1970.

Kameny, Iris, and H. Barry Ritea. Introduction and Overview of the Vicens-Reddy
Speech Recognition System. SDC document TM-4652/001/00. December 1970.

Kameny, Iris, and H. Barry Ritea. Description and Analysis of the Vicens-Reddy
Preprocessing and Segmentation Algorithms. SDC document TM-4652/200/00.
December 1970.

Kameny, Iris, and H. Barry Ritea. Description and Analysis of the Vicens-Reddy
Recognition Algorithms. SDC document TM-4652/300/00. March 1971.

Kellogg,  C.  CONVERSE  Plan  (1/5/71  -  9/5/71).  SDC  document  N-(L)-24445. 
February  1971. 

Kellogg,  C.,  J.  Burger,  T.  Diller,  and  K.  Fogt.  "The  CONVERSE  Natural 
Language  Data  Management  System:  Current  Status  and  Plans."  Proceedings  of 
the  Symposium  on  Information  Storage  and  Retrieval,  April  1-2,  1971, 
University  of  Maryland.  Pp.  33-46.  Published  by  the  Association  for 
Computing  Machinery. 


Travis, L. A New Approach to Implementing and Using Inference in a Question
Answering  System:  Plans  for  CONVERSE.  SDC  document  N-24463.  February  1971. 

4. SYSTEMS RESEARCH


The systems research tasks involve the development of new hardware and software
tools and techniques that are important to the scientific community
and contribute to the fabrication of a computer-assisted planning system.

These projects presently include the ARPA Network development and the Graph-Meta
analysis work, with each effort including both design and implementation
aspects of the overall system development. In the case of the Graph-Meta
research, the design is based on a rigorous analytical foundation in graph
theory,  while  the  network  efforts  are  more  application  oriented,  involving 
the  HOST-to-HOST  protocol  development  and  studies  of  distributed  data-base 
systems  in  a  computer  network.  Both  projects  have  been  quite  active  during 
the  reporting  period,  and  will  soon  have  facilities  available  for  experimental 
usage  by  the  other  projects. 

4.1  NETWORKS 

The goal of the Networks project is to make the SDC ADEPT Time-Sharing System
an operating part of the ARPA Network, and to explore ways in which it can
both contribute resources to the ARPA community and benefit from the services
the community makes available to us. As part of our network effort, we are
also investigating "the distributed data base problem," and have considered
a number of possible ways of integrating dissimilar data management systems.
Because of the long lead time needed to develop a feasible, practical
approach, this effort is being pursued in parallel with the HOST-to-HOST
protocol implementation, so that we can utilize the ARPA Network for experiments
when ADEPT is integrated into it.

The  ADEPT  system  runs  on  an  IBM  360/67  and  utilizes  a  Honeywell  DDP-516 
peripheral  computer  as  an  interactive  I/O  handler.  The  ARPA  Interface 
Message  Processor  (IMP)  is  also  connected  to  the  DDP-516  computer,  which 
collects  and  passes  messages  between  the  Network  and  the  IBM  360.  The 
programs that accomplish this in the DDP-516 and in the IBM 360 have been
coded and debugged and are in use now in our work to integrate ADEPT into
the Network.

The subsystem of ADEPT that interfaces with the Network is called HOSTOSS
(Host Operating SubSystem), and has been designed to have the following
features:


1.  Interprocess communication. Programs running under ADEPT can
communicate with each other and with programs running elsewhere in
the Network. All users can simultaneously have multiple connections.
(Presently, there is an overall limit of 32 connections for ADEPT,
but this can be increased, if necessary, by enlarging some
system tables.)




2.  Ability to log in on remote systems. This can be accomplished by
writing a user-level program incorporating the TELNET specifications
and using interprocess communication. The TELNET specifications will
describe the standard, detailed protocol to be used by all nodes
in teletypewriter-level (character, text, terminal) communication.


3.  Ability to make ADEPT available to remote users. Initially, we
will make one of our 10 job entries available to the external ARPA
community. HOSTOSS is written in such a way that it can be modified
to dynamically vary the number of jobs available. That task
will be accomplished next year after system experience is gained
and user needs are more clearly identified.


4.1.1 Progress


HOST-to-HOST  Protocol  Implementation 


HOSTOSS has been designed, flowcharted, coded in assembly language, integrated
into a new version of ADEPT, and successfully loaded. The non-Network components
of the system are running; the Network components are in the process
of being debugged. For debugging, we have incorporated into our LISP system
a set of Network primitives that allow a LISP user to INITIATE, LISTEN for,
ACCEPT, and CLOSE Network connections as well as communicate on open connections.
These LISP functions have been thoroughly debugged and are used
to invoke the system when we are debugging our Network programs. LISP is
a  guinea-pig  user  of  the  Network  at  the  user  level  and  was  chosen  because  of 
familiarity,  ease  of  programming,  and  flexibility  while  debugging.  It  also 
has  the  desirable  side  effect  of  putting  our  LISP  system  on  the  Network  early. 
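
The life cycle implied by the INITIATE, LISTEN, ACCEPT, and CLOSE primitives can be sketched with a toy, in-memory connection table. This is written in Python rather than LISP, and it is only an illustration under stated assumptions: the class, the socket numbers, and the state names are hypothetical, and nothing here reproduces the actual HOST-to-HOST protocol. The default limit of 32 mirrors the overall ADEPT limit cited earlier.

```python
class ConnectionTable:
    """Toy model of a host's connection table (hypothetical design)."""

    def __init__(self, limit=32):
        self.limit = limit   # overall limit on table entries
        self.state = {}      # socket -> "LISTENING" | "PENDING" | "OPEN"
        self.peer = {}       # socket -> socket at the other end
        self.inbox = {}      # socket -> queued data

    def _claim(self, sock, state):
        if sock in self.state:
            raise ValueError(f"socket {sock} already in use")
        if len(self.state) >= self.limit:
            raise RuntimeError("connection table full")
        self.state[sock] = state
        self.inbox[sock] = []

    def listen(self, sock):
        """LISTEN: wait for a request on a socket."""
        self._claim(sock, "LISTENING")

    def initiate(self, local, remote):
        """INITIATE: request a connection to a listening socket."""
        if self.state.get(remote) != "LISTENING" or remote in self.peer:
            raise ValueError("remote socket is not available")
        self._claim(local, "PENDING")
        self.peer[local], self.peer[remote] = remote, local

    def accept(self, sock):
        """ACCEPT: open the connection requested on a listening socket."""
        other = self.peer.get(sock)
        if self.state.get(sock) != "LISTENING" or other is None:
            raise ValueError("no pending request on this socket")
        self.state[sock] = self.state[other] = "OPEN"

    def send(self, sock, data):
        if self.state.get(sock) != "OPEN":
            raise ValueError("connection is not open")
        self.inbox[self.peer[sock]].append(data)

    def receive(self, sock):
        return self.inbox[sock].pop(0)

    def close(self, sock):
        """CLOSE: release both ends of the connection."""
        for s in (sock, self.peer.pop(sock, None)):
            if s is not None:
                self.state.pop(s, None)
                self.inbox.pop(s, None)
                self.peer.pop(s, None)

table = ConnectionTable()
table.listen(5)
table.initiate(7, 5)
table.accept(5)
table.send(7, "hello")
received = table.receive(5)
table.close(7)
```

Enlarging the limit argument corresponds to the enlargement of system tables mentioned in the feature list.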


The  implementation  of  the  HOST-to-HOST  protocol  has  required  several  system 
changes  in  ADEPT  to  handle  necessary  buffering,  interprocess  communications, 
and control features. These are nontrivial, critical changes to the basic
ADEPT terminal control architecture. Subsequent protocol changes have, therefore,
required extensive recoding and further checkout; although the changes
have long-term advantages, their short-term effect has been considerable
delay in our implementation phase. The most serious effects were caused
by the decision to treat a Network connection as an asynchronous "bit pipe,"
where the sender must be able to send arbitrary amounts of data, with pauses
when necessary, and the receiver must be able to collect these fragments
into a logical message. This required a redesign of our buffering mechanism.
Other system changes were required when the "marking" protocol was abandoned.
Network programs in our case are system routines (HOSTOSS has a total of six
system components, five in the IBM 360 and one in the DDP-516), and must be
thoroughly debugged before they are released for general use. Since SDC
management has insisted on high reliability of our time-sharing system,
the Network activity is at the center of the constant battle between
maintaining high system reliability and improving system capabilities.
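
The buffering problem posed by a "bit pipe" can be sketched as a receiver that accumulates fragments of arbitrary size and releases a logical message only when it is complete. The sketch below is in Python and is not the HOSTOSS buffering mechanism; in particular, the 2-byte length prefix used to delimit messages is an assumption made for illustration, not part of the protocol described here.

```python
class Reassembler:
    """Collects arbitrary-sized fragments and returns whole messages.

    Assumed framing (hypothetical): each logical message is preceded
    by a 2-byte big-endian length.
    """
    HEADER = 2

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, fragment):
        """Add one fragment; return the messages that are now complete."""
        self.buffer.extend(fragment)
        messages = []
        while len(self.buffer) >= self.HEADER:
            length = int.from_bytes(self.buffer[:self.HEADER], "big")
            end = self.HEADER + length
            if len(self.buffer) < end:
                break                      # wait for the rest of the message
            messages.append(bytes(self.buffer[self.HEADER:end]))
            del self.buffer[:end]
        return messages

def frame(payload):
    """Sender side of the assumed framing."""
    return len(payload).to_bytes(2, "big") + payload

# The sender may pause anywhere, even in the middle of a length prefix.
stream = frame(b"LOGIN") + frame(b"OK")
receiver = Reassembler()
first = receiver.feed(stream[:3])
second = receiver.feed(stream[3:8])
third = receiver.feed(stream[8:])
```

The receiver keeps partial data between calls, which is the property the redesigned buffering had to provide.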

Current debugging indicates that the various HOSTOSS components are properly
scheduling each other and are communicating properly through the common
tables. In closed-loop tests we have been able to send an INIT command
through the IMP (to ourselves) and receive it and the RFNM (acknowledgement)
back. We are presently debugging the code that transmits the data during
interprocess communication, and expect to have it working soon. This code
uses the older protocol with "marking"; flowcharting of the modifications to
incorporate the new protocol is in progress.

Several  problems  encountered  with  the  IMP-HOST  interface  have  also  delayed 
the  protocol  checkout.  BBN  is  working  with  us  to  resolve  one  aspect  of  the 
problem,  which  involves  the  apparent  loss  of  messages  within  the  IMP. 


The HOST-to-HOST protocol developments and other Network coordination efforts
have been pursued by participation in the Network working-group meetings.
Also, SDC is represented on the TELNET committee developing the terminal-level
protocol specifications. Other activities have included discussions at
SRI, BBN, MITRE, and Utah on data base applications of the network.

Distributed  Data  Base  Study 

The  purpose  of  the  distributed  data  base  study  is  to  investigate  techniques 
for developing a distributed data management system on a computer network,
with  the  ARPA  Network  specifically  in  mind.  Three  approaches  were  considered: 

1.  A  central  data  management  system,  powerful  enough  to  satisfy  most 
users'  needs.  The  system  could  have  distributed  data  at  the 
various  network  sites,  but  the  retrieval  and  update  mechanisms 
would  be  centralized. 

2.  Integration  of  local  data  management  systems  and  local  data  (files) 
through  the  use  of  a  common  data  management  language  and  appropriate 
local  interfaces. 

3.  Development or adoption of a particular data management system to
be  implemented  on  all  nodes;  the  retrieval  and  update  language  and 
the  logical  data  structures  would  be  standardized. 

After  considering  these  alternatives,  we  chose  approach  2:  to  investigate 
the  integration  of  existing  data  management  systems  through  the  use  of  a 
common  data  management  language  and  appropriate  local  interfaces  that  would 
translate  the  functions  described  in  the  common  language  into  the  local 
data management language. The main advantage of this approach is that it
permits the continued use of existing data management systems and the existing
data bases associated with them, while facilitating the sharing of the
data among the network community of users. Unlike approach 3, this approach
does not inhibit further development of different local data management
systems. It is anticipated that, under normal use, data will
be manipulated locally, for the most part, and less often shared throughout
the network; approach 2 permits continued use of local systems for that
purpose. Other advantages of this approach are a noncentralized data management
system, which is desirable in case of a failure of one facility (node),
and a greater likelihood of acceptance by users because of the use of local
systems. As a first step, a few data management systems that are (or will be)
available on the ARPA Network were chosen, and will be used as a model for
the development of the common language and the local interfaces.
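
The structure of approach 2 can be sketched as one request in a common form plus one translator per site. The sketch is in Python and is purely illustrative: the request fields, the site names, and both local query syntaxes are invented, since neither the common language nor the local interfaces have yet been specified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Retrieval:
    """A retrieval expressed in a hypothetical common language."""
    file: str
    field: str
    value: str

def to_site_a(q):
    # Invented keyword-style local syntax.
    return f"FIND {q.file} WHERE {q.field} EQ '{q.value}'"

def to_site_b(q):
    # Invented list-style local syntax.
    return f'(retrieve {q.file} ({q.field} "{q.value}"))'

# One local interface per data management system (site names are assumptions).
LOCAL_INTERFACES = {"site-a": to_site_a, "site-b": to_site_b}

def dispatch(query, sites):
    """Translate one common-language request for each target site."""
    return {site: LOCAL_INTERFACES[site](query) for site in sites}

requests = dispatch(Retrieval("SHIPS", "PORT", "BOSTON"), ["site-a", "site-b"])
```

Adding a data management system to the network then means adding one translator, while the existing local systems and their data bases remain unchanged.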

The  problems  and  pitfalls  identified  so  far  are: 

1.  Can a useful common language be defined that will provide a
convenient general way of describing functions such as retrievals
and updates? Could English (e.g., CONVERSE English) be the common
language?

2.  How complex will local interfaces turn out to be, and would it
be feasible to implement them locally? Can meta-compiling techniques
generate a family of translators from CONVERSE-like intermediate
language (IL) to local DMS language?

3.  Might the extra layers of interfaces cause on-line response to be
too slow? Alternatively, can a central node perform English-to-IL-to-DMS_i
translation for all nodes?

There  is  a  relation  between  this  work  and  work  being  done  by  others  in  the 
ARPA  Network.  MITRE’s  study  to  demonstrate  the  feasibility  of  accessing  data 
located  at  a  remote  site  would  help  to  determine  the  practicality  of  sharing 
data  on  the  Network.  CCA’s  "centralized"  data  management  system  with  the 
trillion-bit  memory  does  not  conflict  with  this  work,  since  it  can  be  viewed 
as  another  local  system.  (A  possible  common  area  of  interest  is  the 
development  of  a  common  language.)  Also,  the  trillion-bit  memory  can  be  used 
for  on-line  storage  of  files  to  be  used  by  local  interfaces  (files  containing 
the  correlation  between  data  items  of  different  local  files).  Another 
related area of work is the development of the "Form Machine." We participated
in this committee and are considering a possible use of the Form
Machine for transformation of data transmitted between the local interfaces.

4.1.2  Plans 


HOST-to-HOST  Protocol 


Since  the  HOST-to-HOST  protocol  is  expected  to  remain  in  a  state  of  flux  for 
some  time,  one  of  our  long-term  objectives  will  be  to  continue  the 
ADEPT/HOSTOSS evolution to meet the changing requirements. Shorter-term
objectives  include  complete  debugging  of  our  present  implementation — in 
particular,  the  interprocess  communication  and  the  TELNET  functions — and 
the  necessary  coordination  and  user/program  interfaces  to  allow  experimental 
applications  of  the  network  during  the  summer.  These  experiments  are 
currently  being  formulated  in  cooperation  with  other  projects  described 
in  this  report  and  with  other  nodes  in  the  ARPA  Network. 

Distributed  Data  Base  Study 

Our plans for the next half year are to develop a first cut at the use of
CONVERSE-like English as the common language and to define the particular
local interfaces involved. These developments will be coordinated with
other interested ARPA participants. The end product of these efforts will
be a HOST-to-HOST data management protocol to complement the TELNET
and graphics protocols that are currently under development.

4.2  GRAPH-META 

When our graph-meta work began, in 1966, we were concerned with purely
syntax-directed compilers. Later, a compiler-writing system was produced
that handled local code optimization. Currently, we are attacking the
problem of global optimization, i.e., optimization based on graph analysis
of the control flow within a procedure. Although work has been done elsewhere
in global optimization, we are pioneering its integration into a compiler-writing
system to permit automatic generation of code whose quality
approaches, if not surpasses, that of handcrafted code. To this end, the
Generators language is being extended to facilitate the programming of
algorithms involving directed graphs. New operators, data structures, and
data types are being added to the language. Statistical studies have been
made to determine what type of storage structure is best suited for implementing
these language features for the production of practical optimizing
compilers.

In  addition  to  its  application  in  compiler  optimization,  we  hope  that  the 
new  language  will  be  powerful  enough  to  assist  in  applying  graph  theory  to 
network  theory,  communication  theory,  circuit  design,  and  resource  allocation. 

Global optimization is highly dependent upon the availability of detailed
information concerning communication in computer programs. In particular,
questions arise about the flow of control and information, answers to which
completely determine the degree to which the program may be improved
(optimized). For example, consider the sequence of code:


(1)  a ← b * c

(2)  d ← b

(3)  e ← c * d

It is known that multiplication (*) is a commutative process; it follows that,
under ideal conditions, transitivity would identify the formal expressions

      b * c = c * b

and   c * d = d * c

and, with d = b, allow statement (3) to be replaced by

(3')  e ← a

It is necessary, however, that the "ideal" conditions be specified and
satisfied before any attempt is made to modify the code. Clearly, an understanding
of the edited code is critical. For example, if there exists some
path in the program from a definition of the variable c to statement (2)
or (3) that does not pass through statement (1), the substitution would be
invalid and would produce incorrect results. Nor need this be the best
transformation; if the indicated code lies in a portion of the program that
is never executed, it would be best to eliminate it altogether.
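
For the restricted case in which statements (1) through (3) form a single straight-line sequence, the replacement can be sketched with local value numbering. This Python sketch is a standard textbook technique used here for illustration, not the Generators-language implementation; the tuple encoding of statements and the function names are assumptions. It deliberately does nothing about paths that bypass statement (1), which is the global question raised above.

```python
COMMUTATIVE = {"*", "+"}

def value_number(block):
    """Local value numbering over straight-line code.

    block: list of (target, op, left, right); op is an operator or
    "copy" (right is None for a copy). Returns the rewritten block.
    """
    number = {}   # variable -> value number it currently holds
    table = {}    # (op, operand numbers) -> (holder variable, value number)
    counter = 0
    out = []

    def vn(name):
        nonlocal counter
        if name not in number:
            number[name] = counter
            counter += 1
        return number[name]

    for target, op, left, right in block:
        if op == "copy":
            number[target] = vn(left)
            out.append((target, "copy", left, None))
            continue
        operands = (vn(left), vn(right))
        if op in COMMUTATIVE:
            operands = tuple(sorted(operands))   # b * c and c * b share a key
        key = (op,) + operands
        hit = table.get(key)
        if hit is not None and number.get(hit[0]) == hit[1]:
            # The holder still contains this value: reuse it.
            number[target] = hit[1]
            out.append((target, "copy", hit[0], None))
        else:
            number[target] = counter
            table[key] = (target, counter)
            counter += 1
            out.append((target, op, left, right))
    return out

BLOCK = [("a", "*", "b", "c"), ("d", "copy", "b", None), ("e", "*", "c", "d")]
rewritten = value_number(BLOCK)
```

Here statement (3) becomes a copy of a. If c were redefined between statements (1) and (3), the lookup would miss and the multiplication would be kept.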

Matters relating to the safety of a transformation are also of concern. For
example, it would appear that any expression, occurring in a loop, that
maintained a constant value in the loop should be computed in some portion
of the program (i.e., outside the loop) that lay on all possible paths to the
loop but with a lower execution frequency. However, consider

do i ← 1 to n by 1
begin
   if b = 0
      then a(i) ← 0
      else a(i) ← a/b end
end

Here  a/b  is  constant  within  the  loop,  and  one  would  be  tempted  to  transform 
the  code  to  the  form 

t ← a/b

do i ← 1 to n by 1
begin
   if b = 0
      then a(i) ← 0
      else a(i) ← t end
end

But if b = 0, the computation a/b will cause a divide check, a result that
would not have occurred in the original code. A safer approach might be

if b = 0
   then t ← 0
   else t ← a/b end

do i ← 1 to n by 1
begin
   a(i) ← t
end
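
The three versions of the loop can be run side by side to confirm the difference in behavior. The following is a Python transcription, offered as a sketch: the scalar numerator is named num (an assumed name) to keep it distinct from the array a, and Python's ZeroDivisionError stands in for the divide check.

```python
def original(num, b, n):
    """The loop as first written: the division is guarded on every pass."""
    a = [None] * n
    for i in range(n):
        if b == 0:
            a[i] = 0
        else:
            a[i] = num / b
    return a

def unsafe_hoist(num, b, n):
    """Naive hoisting: the division now executes even when b == 0."""
    t = num / b
    a = [None] * n
    for i in range(n):
        if b == 0:
            a[i] = 0
        else:
            a[i] = t
    return a

def safe_hoist(num, b, n):
    """Hoist the test together with the division."""
    if b == 0:
        t = 0
    else:
        t = num / b
    a = [None] * n
    for i in range(n):
        a[i] = t
    return a

same_when_nonzero = original(6, 3, 2) == unsafe_hoist(6, 3, 2) == safe_hoist(6, 3, 2)
same_when_zero = original(6, 0, 2) == safe_hoist(6, 0, 2)
try:
    unsafe_hoist(6, 0, 2)
    divide_check = False
except ZeroDivisionError:
    divide_check = True
```

All three agree when b is nonzero; only the naive version fails when b is zero.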

The  number  of  special  cases  requiring  analysis  is  large,  and  a  rigorous 
mathematical  foundation  is  necessary  to  produce  an  optimizing  compiler  that 
produces  correct  optimal  code.  This  is  being  done  with  the  mathematical 
theory  of  graphs  applied  to  program  flow.  The  flow  of  control  in  a  program 
can  be  modeled  by  a  directed  graph.  An  algebraic  system  may  be  joined  to 
the  graph  to  provide  a  basis  for  analyzing  information  flow,  computational 
redundancy  and  invariance,  execution  frequency  and  computation  efficiency, 
data  dependencies,  resource  and  register  allocations,  and  so  forth. 
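
One concrete instance of such graph analysis is the computation of dominators: a node d dominates a node n if every path from the program entry to n passes through d, which is the condition "lay on all possible paths" used above. The Python sketch below uses the familiar iterative formulation; it is an illustration under assumptions (every node reachable from the entry, node names invented), not the method of the forthcoming text.

```python
def dominators(succ, entry):
    """Return {node: set of nodes that dominate it}.

    succ maps each node to a list of its successors. All nodes are
    assumed to be reachable from entry.
    """
    nodes = set(succ)
    for targets in succ.values():
        nodes.update(targets)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Hypothetical flow graph of a loop containing an if-then-else.
FLOW = {
    "entry": ["test"],
    "test": ["body", "exit"],
    "body": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["test"],
    "exit": [],
}
dom = dominators(FLOW, "entry")
```

In this graph neither branch of the conditional dominates the join node, so a computation placed in one branch cannot be assumed to have executed when the join is reached.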

4.2.1  Progress 

The  major  goal  of  the  project  for  this  contract  period  is  to  produce  an 
experimental  optimizing  FORTRAN-IV  compiler  as  a  benchmark  for  evaluating 
and  demonstrating  optimization  with  the  graph-manipulator  features  of  the 
Generators  language  of  META  in  a  practical  situation.  This  compiler  will 
consist  of  three  passes.  The  first  pass  has  been  coded  and  is  in  the  process 
of  checkout.  Design  has  begun  on  passes  two  and  three.  No  coding  has  been 
done  on  the  second  pass,  and  very  little  on  the  third.  These  last  two 
passes  are  discussed  in  the  section  on  plans. 

The theoretical basis for this compiler, and for other work at SDC in compiler
optimization, is embodied in a formal text that is nearing completion
this year. Part I of the two parts has been
completed; Part II is more than half finished.

Benchmark  Compiler  Pass  I 

The first pass performs syntax analysis, converts DO loop and Boolean
expressions into GOTO's, and allocates storage, taking into account
equivalence declarations. The storage-allocation subroutine is especially
interesting because it takes only 3/4 of a page of text when written in the
Generators language. In the CDC extended FORTRAN, which was written in
FORTRAN, this routine takes about six written pages. As a further measure of
the Generators language's power of expression, a number of algorithms
involving directed graphs were written and checked out in ALGOL and APL.

The  algorithm  for  ordering  the  nodes  of  a  graph  took  six  written  pages  in 
ALGOL,  20  lines  in  APL,  and  12  lines  in  the  Generators  language. 
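
The report does not say which ordering algorithm was measured. As a point of reference only, the following Python sketch shows one common way of ordering the nodes of a flow graph, a depth-first reverse postorder, in which every node precedes its successors except across loop-closing edges. The function name and example graph are assumptions made for illustration.

```python
def reverse_postorder(succ, entry):
    """Order the nodes reachable from entry by reversed depth-first finish time."""
    seen, order = set(), []

    def visit(node):
        seen.add(node)
        for nxt in succ.get(node, []):
            if nxt not in seen:
                visit(nxt)
        order.append(node)          # appended only after all descendants

    visit(entry)
    order.reverse()
    return order


# Hypothetical flow graph with one loop (5 -> 2).
flow = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
order = reverse_postorder(flow, 1)
```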

New  features  have  been  added  to  the  Generators  language  for  use  in  the 
optimization  pass.  These  include  floating-point  arithmetic,  which  is  needed 
for compile-time evaluation of expressions, and arrays. One-dimensional
Boolean arrays have been added to represent definition-use vectors. Graphs
themselves  may  be  represented  as  two-dimensional  Boolean  arrays  or  as  list 
structures. 
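
A minimal Python sketch of the two Boolean representations follows. It stores a graph as a two-dimensional Boolean array, computes reachability with Warshall's algorithm, and uses one-dimensional Boolean vectors to mark where a variable is defined and used. The example graph, the vectors, and the function names are hypothetical, and the reachability test deliberately ignores intervening redefinitions.

```python
def transitive_closure(adj):
    """Warshall's algorithm: reach[i][j] is True if a path leads from i to j."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def definition_may_reach_use(defined, used, reach):
    """True if some defining node has a path to some using node."""
    n = len(defined)
    return any(defined[i] and used[j] and reach[i][j]
               for i in range(n) for j in range(n))


T, F = True, False
# Hypothetical four-node graph: 0 -> 1, 1 -> 2, 2 -> 1, 2 -> 3
adj = [
    [F, T, F, F],
    [F, F, T, F],
    [F, T, F, T],
    [F, F, F, F],
]
reach = transitive_closure(adj)

defined = [T, F, F, F]   # variable X is assigned in node 0
used = [F, F, F, T]      # and referenced in node 3
x_reaches = definition_may_reach_use(defined, used, reach)
```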

Optimization  Text:  Part  I 

A treatise entitled "A Mathematical Theory of Global Program Analysis"
is in preparation for publication as a textbook this year.* The book is
divided into two major sections: the first develops the mathematical
structures involved in global program analysis, and the second
is devoted to the applications of this theory to the problem of optimization.
The subjects covered are indicated in the table of contents:


Introduction 

Part  I:  Theoretical  Foundations 

1.  Preliminary  Notation 

2.  Weak  Ordering  Associated  with  a  Graph 

3. Dominance, Partitions and Intervals of Graphs

4.  Derived  Intervals,  Reducible  and  Irreducible  Graphs 

5.  Vertex  Ordering  Algorithms 

6.  Lattice  Algebra  and  the  Reduction  of  Irreducible  Graphs 

7. The Connectivity Matrix and Prime Cycles

Part II: Global Program Optimization

8. Data Flow Analysis, Dependency and Redundancy Equations

9. Constant Subsumption, Common Subexpression Suppression and
Code Motion

10. Loop Optimization, Invariant Expression Removal and
Reduction in Strength of Operators

11. Safety Considerations

12. Dead Expression Elimination

13. Subroutine Linkages

14. Execution Frequency Considerations

15. Register and Storage Allocation

Bibliography

Appendix I: Graph Theoretic Algorithms in APL/360


* Schaefer, Marvin. A Mathematical Theory of Global Program Analysis.
SDC document TM-(L)-4602. August 1970.

The  general  problem  of  determining  the  logical  and  computational  equivalence 
of  two  programs  is  known  to  be  recursively  unsolvable,  and  is  not  treated 
in  the  text.  Although  the  result  is  of  theoretical  interest,  it  is  not 
directly  relevant  to  the  problem  of  optimization  because  the  compiler  writer 
has control over the transformations employed in the optimization. The
necessary  and  sufficient  conditions  for  preserving  equivalence  are,  however, 
described  in  the  chapter  on  Safety  Considerations. 

4.2.2  Plans 

The  bulk  of  our  activities  for  the  latter  half  of  the  contract  will  be 
devoted  to  checking  the  theoretical  basis  of  our  algorithms  via  the  FORTRAN 
compiler  implementation.  These  activities  will  focus  on  the  completion  of 
Passes  II  and  III,  and  draft  publication  of  the  optimization  text. 

Benchmark  Compiler  Pass  II 

Global  optimization  takes  place  during  the  second  pass  of  the  compiler.  It 
is  done  on  the  basis  of  a  directed  graph  that  represents  the  flow  of 
control  of  the  program.  That  is  why  DO  loop  and  Boolean  expressions  were 
expanded  into  control  statements  during  the  first  pass.  Some  compiler 
writers have feared that information would be lost in such a process, but
this  does  not  prove  to  be  the  case.  All  of  the  loop  structures  that  were 
originally  represented  by  DO’s  can  be  recognized  from  the  graph,  in  addition 
to  loops  that  the  FORTRAN  programmer  himself  coded  with  GOTO’s.  Most  of 
the  research  described  in  the  optimization  text  will  be  used  to  advantage 
in  this  pass. 
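
To illustrate how a loop can be recovered from the graph alone, the following Python sketch finds loop-closing edges by depth-first search and then collects the body of each loop. It assumes a reducible flow graph. The node numbering is a hypothetical rendering of a DO loop after expansion into GOTO's, and the function names are invented for this example.

```python
def back_edges(succ, entry):
    """Edges u -> v whose target v is still open on the depth-first stack."""
    ON_STACK, DONE = 1, 2
    state, found = {}, []

    def visit(u):
        state[u] = ON_STACK
        for v in succ.get(u, []):
            if v not in state:
                visit(v)
            elif state[v] == ON_STACK:
                found.append((u, v))
        state[u] = DONE

    visit(entry)
    return found


def loop_body(succ, tail, head):
    """Nodes that can reach tail without passing through head, plus head."""
    pred = {}
    for u, targets in succ.items():
        for v in targets:
            pred.setdefault(v, []).append(u)
    body, work = {head, tail}, [tail]
    while work:
        node = work.pop()
        if node == head:            # only happens for a one-node loop
            continue
        for p in pred.get(node, []):
            if p not in body:
                body.add(p)
                work.append(p)
    return body


# Hypothetical expansion of a DO loop:
# 1 initialize index, 2 test limit, 3 loop body, 4 increment, 5 exit
flow = {1: [2], 2: [3, 5], 3: [4], 4: [2], 5: []}
edges = back_edges(flow, 1)
loops = [loop_body(flow, tail, head) for tail, head in edges]
```

A loop that the programmer wrote by hand with GOTO's produces the same kind of edge, so the search does not need to know how the loop was originally expressed.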

Benchmark Compiler Pass III

The third pass, the code-generation pass, is similar to code generation in
locally optimizing compilers. The new technique employed here is the
use of a few general registers augmented by core for temporary storage:
intermediate results will normally be kept in registers but will be stored
in core when too few registers are available.

It  looks  as  if  it  will  be  possible  to  program  this  entire  algorithm  in  the 
Generators  language  without  adding  any  special  features.  This  has  the 
advantage  of  giving  the  compiler  writer  control  over  register  allocation 
rather  than  freezing  a  scheme  into  the  compiler-writing  system. 
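
The register scheme can be sketched in Python as follows. The report does not give the algorithm itself, so this is only one plausible reading: an expression in postfix form is evaluated in a small set of registers, and when none is free the oldest register-resident intermediate result is stored in a core temporary and reloaded when needed. The instruction mnemonics, register and temporary names, and the spill-the-oldest policy are all assumptions.

```python
OPERATORS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(postfix, num_regs):
    """Return pseudo-instructions that evaluate a postfix expression."""
    if num_regs < 2:
        raise ValueError("this sketch needs at least two registers")
    free = [f"R{i}" for i in range(num_regs)]
    registers = set(free)
    stack = []          # where each pending value lives: register or temporary
    code = []
    temp_count = 0
    for token in postfix:
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            if left not in registers:
                # The left operand was stored in core earlier; reload it.
                reg = free.pop(0)
                code.append(f"LOAD  {reg},{left}")
                left = reg
            code.append(f"{OPERATORS[token]}   {left},{right}")
            free.append(right)          # the right operand is now dead
            stack.append(left)          # the result replaces the left operand
        else:
            if not free:
                # No register free: store the oldest register value in core.
                oldest = next(i for i, loc in enumerate(stack)
                              if loc in registers)
                temp_count += 1
                temp = f"T{temp_count}"
                code.append(f"STORE {stack[oldest]},{temp}")
                free.append(stack[oldest])
                stack[oldest] = temp
            reg = free.pop(0)
            code.append(f"LOAD  {reg},{token}")
            stack.append(reg)
    return code


# a - (b - (c - d)) in postfix form, compiled for two and for four registers
two_regs = generate(list("abcd---"), 2)
four_regs = generate(list("abcd---"), 4)
```

With four registers every intermediate result stays in a register; with two, the same expression needs two stores to core.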

Several  other  problems  are  being  encountered  in  writing  this  third  (code 
generation)  pass.  They  are:  (1)  how  to  automatically  combine  library 
routines  into  a  user’s  program;  (2)  what  type  of  subroutine  linkage  to 
use;  and  (3)  how  to  pass  addresses  to  a  subroutine,  since  the  current 
compiler-writing  system  allows  address  constants  only  in  common.  Several 
solutions  are  available  for  each  problem,  so  it  is  only  an  engineering 
problem  of  evaluating  them  and  choosing  the  best. 

Optimization Text: Part II

The  second  part  of  the  optimization  text  is  being  written  and  will  be 
finished  during  the  latter  half  of  the  contract.  Preliminary  drafts  will  be 
circulated  among  experts  in  the  field  for  scrutiny.  A  revision  based  on 
review comments and feedback from the FORTRAN application will be incorporated
into the text. Supplementary appendices containing relevant statistical
information may be prepared as time and resources permit.

5. INTERACTIVE SYSTEMS


Interactive systems projects reported here include Problem Solving and
Learning by Man-Machine Teams, Time-Sharing, and LISP Extensions, and are part
of the larger task of Systems Research covered in section 4. However, they
have been broken out for separate treatment here because they all deal with
man-machine interfaces, whereas section 4 tasks are focused on internal,
machine-machine problems. Furthermore, two of the projects, Time-Sharing and
LISP Extensions, are "shadow activities" in that they are mandatory supporting
efforts to the more visible research reported earlier.

The Problem Solving project is aimed at building a model, called Gaku, of
man-machine cooperation in solving complex, non-deterministic, real-world
problems. A key step toward that goal is the implementation of a
User-Adaptive Language (UAL) for stating to the computer the problems and
the interactive steps needed for their solution.

Our  time-sharing  system,  ADEPT,  supports  the  bulk  of  our  research  program, 
and  although  no  research  effort  is  specifically  expended  for  its  expansion, 
development  effort  is  spent  to  accommodate  the  research  objectives  of  other 
projects,  most  notably  Networks,  CONVERSE,  and  Problem  Solving.  Developments 
embodied in ADEPT Releases 8.8 and 8.9 include a near-doubling of user-program
memory to 85 pages (approximately 348,000 bytes) of core, completion
of  the  Object  Sub-System  control  mechanism,  flexible  terminal  support  for 
a  wide  variety  of  baud-rate  devices,  and  a  dramatic  improvement  in  system 
reliability  that  has  resulted  in  a  quadrupling  of  system  mean  time  to  failure. 

5.1 PROBLEM SOLVING AND LEARNING BY MAN-MACHINE TEAMS

The  long-range  goal  of  this  project  is  the  development  of  a  man-machine  system, 
called  Gaku,  that  couples  the  capabilities  of  man  and  computer  for  cooperative 
planning  and  creative  problem  solving  in  practical,  real-world  situations. 

The design of Gaku is a conceptual step toward the realization of a
synergistic system, that is, one that combines built-in computer capabilities
and human intellectual capabilities to promote the dynamic extension of both
kinds of capabilities through man-machine interaction, leading to a
"co-evolving" man-machine team.

Previous models of Gaku were designed mainly for preliminary exploration of
and experimentation with research ideas and techniques, and were intended
for a single user. The past year's accomplishments included (1) a new
design of Gaku that incorporates team planning and problem solving and that
aims at real-world problems,* (2) the design and detailed specification
of a User-Adaptive Language (UAL),* and (3) the examination of many
real-world problems that are in need of a man-machine approach.

* Hormann, Aiko M. Planning by Man-Machine Synergism: Characterization of
Processes and Environment. SDC document SP-3484. March 1970.

UAL  was  designed  to  provide  a  convenient  and  flexible  means  of  man-machine 
communication.  It  enables  the  user  to  begin  interacting  with  Gaku  at  the 
initial  conceptual  stage  of  problem  definition  and  continue  through  the 
exploratory  stages  of  problem  solution.  UAL  is  used  for  two  purposes: 
as  a  programming  language,  it  is  used  for  initial  Gaku  implementation;  in 
its  extended  form,  it  is  used  for  user-Gaku  interaction  in  problem  solving 
and  for  designer-Gaku  interaction  in  system  modification.  Higher-level, 
user-oriented functions and capabilities can be constructed for the user's
convenience  through  the  extensible  features  of  UAL  and  its  techniques  for 
building  problem-oriented  primitives. 

Many areas of potential application were examined, some in detail. Several
areas (or classes) of real-world problems to which man-machine techniques
may be fruitfully applied have been characterized, and the types of decision
dynamics influenced by these characteristics have been identified (Hormann,
International Journal of Man-Machine Studies). Military applications in both
strategic and tactical planning have also been investigated.

5.1.1  Progress 

Work  during  the  current  contractual  period  is  focused  on  three  related  areas — 
UAL  implementation,  foundational  work  toward  prototype  Gaku  development,  and 
problem  applications. 

UAL  Implementation 

Prototype  UAL  is  being  implemented  on  the  ADEPT  time-sharing  system,  using 
LISP  1.5  as  its  source  language.  Efficiency  in  the  use  of  both  computer  time 
and  core  memory  space  was  sacrificed  in  order  to  speed  the  implementation. 
Progress  has  been  rapid  since  this  policy  was  adopted,  and  a  modestly 
sophisticated  demonstration  of  man-machine  interaction  is  now  possible. 

UAL's current capabilities include complex list-structure manipulation,
function  creation,  and  extensibility  (the  ability  to  create  new  primitives 
and  functions  and  to  define  new  infix  operators).  More  ambitious  UAL 
features  are  yet  to  be  implemented. 

The major problem encountered early during this reporting period was the
limited core memory available in LISP. This problem was alleviated by a new,
"growable" LISP system that became available during the period (see section
5.3).  However,  considerable  space  in  the  new  system  will  be  consumed  by  Gaku 
implementation,  leaving  little  for  user-generated  programs.  Response  time 
is  currently  tolerable,  but  will  not  be  when  more  complex  functions  for 
list manipulation are exercised. The impact of these limitations is serious,
since relaxing the constraints is impractical at present in terms of both
cost and system design. A major re-evaluation of the technical approach is
in progress.

Foundational  Work  Toward  Prototype  Gaku  Development 

From  the  many  features  in  the  Gaku  design,  a  very  basic  set  has  been  selected 
for  the  first  phase  of  development.  These  features  are  being  implemented  in 
UAL,  in  a  pencil-and-paper  simulation,  so  that  when  UAL  is  ready,  basic  Gaku 
can become operational. So far, only rudiments of Gaku's executive,
user-portrayal, and graphic-display components have been written in UAL. (The
user-portrayal  component  allows  the  user  to  leave  unspecified  portions  that 
are  to  be  filled  in  later  as  on-line  decisions.) 

Problem  Applications 

Many problems in the areas of military planning and logistics, law enforcement
and criminal justice, health-care programs, and business planning and
management  have  been  examined  with  the  following  criteria  in  mind: 

1. The problem is sufficiently complex and ill-defined (and unsolved)
that the use of man-machine synergistic techniques may open up
new possibilities of solving it, or at least of coping with its
decision-making intangibles logically and systematically. The
man-machine team will be able to examine a much larger number of
alternatives, weighing many different factors, than could normally
be examined before a final decision has to be made.

2. The need for good decisions (or answers, or new insights and
understanding) is sufficiently urgent, and the possible effects of
a wrong (or inadequate) decision are serious enough, that strong
arguments can be made for investing the extra time, effort, and
funds that would be necessary to employ man-machine synergistic
techniques.

3. The problem poses processing requirements that are within the
capabilities of the current SDC facilities. The conditions to be
met include tolerable memory requirements, maximum tolerable on-line
response time (different time limits are likely to be required
for different types of interaction), and sophistication
in the use of the display-scope techniques.

Although many of the problem situations that have been examined meet the first
two  criteria,  only  a  few  meet  the  third;  most  require  a  large  data  base,  or 
a  large  set  of  problem-oriented  programs,  or  both.  Current  support  levels 
limit  us  to  attacking  either  a  relatively  small,  simpler  class  of  problems, 
or  a  specific,  limited  aspect  of  the  problem-solving  process.  We  are 
pursuing  the  latter  course,  as  described  in  the  following  paragraphs. 

One aspect of problem solving that is common to a wide variety of
problem-solving situations, and that appears to satisfy the above criteria,
is that of the value judgments that enter into evaluating and comparing
proposed courses of action (or designs), especially those that affect the
public or require extensive funding and other resources. Man's ability to
evaluate alternative courses of action grows increasingly tenuous as the
number of criteria that must be considered increases. The difficulty is
exacerbated when, in group decision making, different value orientations are
present and must somehow be reconciled. These value orientations are implicit
in both cost and benefit considerations, but there is no technique for
defining them with sufficient precision that they can be properly weighed and
included in cost/benefit analyses. As a result, important criteria are
often excluded from consideration in decision making.

In value judgments, then, we have an element of the problem-solving process
that is complex and ill-defined and whose exclusion from the process can
result in solutions that are judged either inadequate or wrong. These
conditions satisfy the first two of the three criteria listed above. To
attack this problem, we have developed a set of methods and techniques
(called EVALUATION) that use the "fuzzy-set" concept in the man-machine
context (Hormann, SP-3590). Qualitative (or value-oriented) information
can be intermixed with quantitative (or factual) information; our techniques
will assist evaluators in manipulating qualitative-quantitative information
systematically and consistently. The techniques will also assist in group
interaction toward a final decision that takes different value orientations
into account. Implementing these techniques is not expected to require
large amounts of core memory, and they should be able to handle a fair-sized
set of data (e.g., 10 alternatives, each with 100 attributes); we expect,
then, that the third criterion will be satisfied.
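
The EVALUATION techniques themselves are described in the cited document and are not reproduced here. Purely to illustrate how a fuzzy-set treatment can put factual and value-judgment information on a common footing, the following Python sketch maps each attribute to a membership grade between 0 and 1 and combines the grades by fuzzy intersection (the minimum). The criteria, thresholds, rating scale, and aircraft figures are all invented for the example.

```python
def ramp(low, high):
    """Membership function rising linearly from 0 at low to 1 at high."""
    def grade(value):
        if value <= low:
            return 0.0
        if value >= high:
            return 1.0
        return (value - low) / (high - low)
    return grade


# Hypothetical mapping of an evaluator's verbal ratings to membership grades.
RATINGS = {"poor": 0.2, "fair": 0.5, "good": 0.8, "excellent": 1.0}

# Two factual criteria and one value-judgment criterion (all assumed).
CRITERIA = {
    "range_km": ramp(500, 2000),
    "payload_kg": ramp(1000, 5000),
    "maintainability": RATINGS.__getitem__,
}


def overall_grade(alternative):
    """Degree to which an alternative satisfies every criterion at once."""
    return min(grade(alternative[name]) for name, grade in CRITERIA.items())


ALTERNATIVES = {
    "design A": {"range_km": 1700, "payload_kg": 3000,
                 "maintainability": "good"},
    "design B": {"range_km": 2000, "payload_kg": 2000,
                 "maintainability": "excellent"},
}
scores = {name: overall_grade(alt) for name, alt in ALTERNATIVES.items()}
best = max(scores, key=scores.get)
```

Taking the minimum means that an alternative is judged by its weakest criterion; a weighted combination would be an equally defensible choice, and nothing in the report indicates which is used.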

The many potential applications of these techniques that have been identified
include evaluations of: (a) complex equipment with many performance criteria
(from the buyers' points of view); (b) new products to be introduced
into specified segments of the market (from the marketers' points of view);
(c) proposals submitted by potential contractors (from the agencies' points
of view); (d) proposed locations for large complexes (e.g., missile bases,
defense centers, airports, health-care centers, and many more); and
(e) proposed configurations of components that interact (e.g., computer
hardware with many options for peripheral equipment, and software packages).

One concrete application* within area (a) is the selection of appropriate
aircraft designs (for a given set of mission requirements) from alternative
sets of performance characteristics, including both factual information and
value-judgment information.

5.1.2 Plans

Plans for the next six months include continued work in the three areas
described: UAL implementation, foundational work toward prototype Gaku
development, and problem applications. To alleviate the core-memory problem,
UAL programs will be tightened, and techniques for "swapping" segments of UAL
and Gaku in and out of disc and tape storage will be explored. Implementation
of the EVALUATION tool will be started, working toward creating a demonstrable
prototype; other application problems, in addition to the aircraft design
example, will also be examined.

5.2  TIME-SHARING 

The ADEPT time-sharing executive** functions as the operating system for SDC's
ARPA research projects. The system was designed for the IBM 360/50H computer
and modified to run on the IBM 360/67I. In the past, a number of system
releases were produced for various government agencies, and, during the
previous year, the system served as an experimental basis for systems research
in time-sharing, networks, natural input/output, and flexible digital
communications. Communications research and development activities center
on a combined hardware/software system developed to flexibly interface a
variety of terminals and display devices to time-sharing systems, ADEPT in
particular. The hardware consists of a Honeywell DDP-516 connected through
a Honeywell-supplied interface with the multiplexor channel of the IBM 360.
The software consists of a Honeywell-supplied programmed multiline controller
(PMLC), which was modified and expanded to serve as our operating executive
program in the DDP-516.

During  the  reporting  period,  in  support  of  our  systems  research  goals,  the 
project  was  concerned  with  the  hardware/software  systems  design  and 
implementation  activities  that  relate  to  the  ADEPT-67  and  DDP-516  executives. 


* Suggested by a member of the ARPA/AGILE group.

** Weissman, Clark, Clay E. Fox, and Richard R. Linde. The ADEPT-50
Time-Sharing System. SDC document SP-3344. August 1969.

5.2.1 Progress

ADEPT  Transfer 


During the reporting period, ADEPT was transferred from the IBM 360/50 to
the IBM 360/67 computer (Linde, N-(L)-24460). The transfer involved the
use of a new swap device, a 2301 drum rather than the Model 50H
2303 drum. In the course of an effort to improve the efficiency of the
system, a major problem with the swap device was uncovered. Through the use
of our system benchmark programs,* it was found that swaps from the 2301
were 25% slower than those from the 2303 because the timing of the 2301
channel program was slower than anticipated. Two dummy records were inserted
on each addressable track to resolve the timing limitations. Subsequent
analysis has shown that swap speeds are now four times those of the Model 50H
2303 drum, which is the expected efficiency gain.

Large Program Support

Because  of  the  increasing  demand  by  users  for  more  memory  (see  sections  3.1, 
4.1,  and  5.1),  a  second  IBM  360/67  release  began  daily  operation  on 
30 November 1970. This release, ADEPT 8.8, was designed to make use of the
additional 64 4096-byte pages of core memory available on the 360/67I
(Linde, N-(L)-24460). Core was apportioned to give our user programs a
greater  amount  of  resident  core  and  to  bolt  portions  of  our  high-usage 
swappable  executive  programs.  On  the  Model  50H,  user  programs  could  grow 
only  as  large  as  46  pages;  ADEPT  8.8  will  support  85-page  user  programs. 
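
The page counts above translate into bytes as shown in the short Python calculation below, which uses only the 4096-byte page size and the 46-page and 85-page limits quoted in this section. The variable names are illustrative.

```python
PAGE_BYTES = 4096                 # page size quoted for the 360/67

old_pages, new_pages = 46, 85     # Model 50H limit and ADEPT 8.8 limit

old_bytes = old_pages * PAGE_BYTES
new_bytes = new_pages * PAGE_BYTES
growth = new_pages / old_pages    # ratio of new limit to old limit

print(f"{old_pages} pages = {old_bytes:,} bytes")
print(f"{new_pages} pages = {new_bytes:,} bytes")
print(f"growth factor = {growth:.2f}")
```

The 85-page figure corresponds to 348,160 bytes, and the growth factor of about 1.85 is a little short of a doubling.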

Network  Interface 


On  3  February  1971,  a  third  360/67  release  (ADEPT  8.9)  was  constructed  for 
system  testing  and  experimentation.  This  release  includes  the  system 
modules  to  interface  with  the  ARPA  Network  (see  section  4.1).  The  Systems 
Research  activity  was  involved  with  interfacing  these  components  to  ADEPT  8.8. 
In  the  main,  this  interface  involved  redesigning  our  batch  monitor  job  (a 
ghost  job  in  that  it  has  no  interactive  terminal  associated  with  it)  to 
support  an  executive  Network  Control  Program  and  an  object  process  that  runs 
in  supervisor  state.  Other  changes  involved  adding  a  supervisor-state  SVC 
call  and  an  executive,  internal  Load-and-Go  call  to  the  system. 

The following additional changes were made to the 360/67 executive portion
of the system:

•  A utility monitor function to process /PRESTORE, /PUNCH, and /PRINT
commands in an overlapped mode.

•  An I/O error-recording facility for recoverable errors. The
printout occurs on the operator's 1052 terminal.

•  A utility program for saving and restoring ADEPT file structures
on backup tapes. This program is under control of the operator
and runs in the supervisor state.

•  A statistics program that gives a measure of the daily
performance of the operating system.

Communications  Multiplexor 

Our DDP-516 Programmable Controller, programmed to achieve flexible
interfaces, serves as our communications multiplexor. As part of our
continuing search for more cost-effective improvements, we are planning to use
an ADS (American Data Systems) multiplexor connected to the synchronous
single-line controller on a direct multiplex channel (DMC) to do the bit
search and character building now done in the PMLC (see Figure 5-1). This
will reduce processor overhead, allowing more simultaneous terminals and a
greater variety of them.

Using a 300-baud clock and a 110-baud clock interrupting at seven times
the bit rate, we are now getting 2,870 interrupts per second. This accounts
for much of the high (50%) overhead time on the PMLC. If we support
600-baud terminals (which is under serious consideration because of their
improved speed and performance), the number of interrupts per second will
increase to about 5,000, which is an unacceptable overhead. Using the ADS
unit running at 11,040 baud on the DMC, we expect only 60-70 interrupts per
second in the DDP-516, which will no longer have to do the character building
and bit shuffling now required. The time and storage thus freed permit
expansion in both the communication channels and the baud rates that can be
supported.
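
The interrupt figures can be checked with a short Python calculation. The seven-samples-per-bit factor and the clock rates come from the text. The assumption that a 600-baud clock would replace the 300-baud clock, rather than run alongside it, is ours, made because it reproduces the "about 5,000" figure.

```python
SAMPLES_PER_BIT = 7   # each clock interrupts at seven times its bit rate


def interrupts_per_second(clock_bauds):
    """Total interrupt rate for a set of bit-rate clocks."""
    return SAMPLES_PER_BIT * sum(clock_bauds)


current = interrupts_per_second([300, 110])
# Assumption: the 600-baud clock replaces the 300-baud clock.
proposed = interrupts_per_second([600, 110])

print(current, proposed)
```

Under that assumption the load rises from 2,870 to 4,970 interrupts per second, against the 60-70 expected with the ADS unit.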

During  the  reporting  period,  a  program  was  written  to  gain  familiarity  with 
the  ADS  hardware  and  to  perform  a  diagnostic  analysis  of  the  hardware.  The 
routines  to  interface  with  the  PMLC  portion  of  the  DDP-516  executive  and  the 
ADS  multiplexor  were  designed,  coded,  and  assembled.  These  modules  have 
cycled and are currently being debugged.

Hardware  problems  with  the  ADS  synchronization  circuit  caused  considerable 
loss  of  time  on  the  input  side  of  the  integration  effort.  The  two  ASCII 
sync characters were inverted, which produced one sync-bit pattern. This
problem has been corrected. An error in the Honeywell manual regarding an
End-of-Range skip instruction also caused difficulty.

System Reliability

An intensive effort was made to improve the software and hardware reliability
of the ADEPT-67 system during this period. One of the more difficult problems
occurred in the 360-to-516 hardware and software interface. Communication
between the IBM 360 and the DDP-516 can be initiated by either side.
However, because of a timing gap between the first signal of initiation
from one side and the response from the other side, both sides could
initiate communication simultaneously. Although this sequence was
anticipated by both the hardware and the software designers, one case of
this simultaneous initiation was overlooked by Honeywell. The problem has
been resolved, after an extensive investigation of hardware and software
logic (Peng, N-(L)-24465), through a one-wire DDP-516 connection.

5.2.2  Plans 


During  the  remainder  of  the  year,  a  number  of  activities  will  take  place  to 
extend the capabilities of the ADEPT-67 time-sharing system (Linde,
N-(L)-24453). They are as follows:

•  An  interface  to  library  functions  will  be  implemented  with  the 
ADEPT  F-level  assembler.  This  will  add  to  the  set  of  MACRO 
functions  available  to  ADEPT  and  OS/360  users. 

•  Completion  of  the  ADS  multiplexor  system  checkout  will  involve 
replacement  and  integration  of  portions  of  the  Honeywell  PMLC  with 
our  ADS  software  interface.  The  completion  of  this  activity  will 
provide  additional  core  storage  for  other  DDP-516  activities. 

•  In  order  to  enhance  system  reliability,  it  is  desirable  to  add  a 
DDP-516  memory-protection  capability.  Software  utilization  of  this 
hardware  will  follow  the  ADS  system  completion. 

•  An  activity  to  increase  the  number  of  ADEPT-67  interactive  jobs 
from  10  to  15  will  be  initiated  this  summer  upon  successful 
completion  of  the  ADS  multiplexor  system  checkout. 

•  Completion of basic ARPA Network interfaces will involve additional
changes to the ADEPT-67 system. We plan to add more non-conversational
jobs to the system for our remote HOST users and to enhance
the set of operator Network commands.

5.3 LISP EXTENSIONS

The SDC LISP 1.5 system is a proprietary SDC product written in 1968.*
The system operates under the TS/DMS time-sharing executive on IBM 360
computers. SDC has made the LISP system available (on a no-cost basis) to
ARPA for many years. It is the basic product that was modified for operation
on the ADEPT system. The principal users of the LISP system are the
ARPA-sponsored CONVERSE and Problem Solving and Learning by Man-Machine Teams
projects (sections 3.1 and 5.1, respectively).

Because  of  the  severe  demands  made  on  the  LISP  system,  a  number  of  specific 
extensions  and  improvements  to  LISP  are  being  made.  They  are: 

1.  A  feature  to  allow  LISP  to  grow  and  take  advantage  of  the  greater 
core  space  made  available  to  users  under  ADEPT  8.8  (see  section  5.2). 

2.  A  TRY  and  EXIT  mechanism  to  allow  LISP  programs  to  field  their 
own  errors. 

3.  A  mechanism  to  allow  portions  of  LISP’s  binary  program  space  to  be 
unloaded  onto  disc,  giving  more  space  in  core  for  data. 

4.  An  infix  translator  to  extend  LISP  syntax. 

5. Significant improvement to the capabilities and user conventions
of LISP interaction, via the time-sharing executive, with peripheral
input/output devices.

6.  Improvements  to  the  LISP  edit  program,  LISPED. 

7.  Rewriting  the  primitive  LISP  functions  CAR,  CDR,  ATOM,  etc., 
to  increase  speed  and  efficiency. 

8.  Writing  a  Core  Image  Generator,  which  would  run  under  the  ADEPT 
time-sharing  executive  to  allow  regeneration  of  a  brand  new  LISP, 
thereby  gaining  more  core  space  by  cleaning  up  certain  patch  areas. 
The  existence  of  the  Core  Image  Generator  would  also  make  it 
possible  to  create  specialized  copies  or  versions  of  the  LISP 
system  for  special  purposes. 

9.  Creating  an  expanded  set  of  documents  describing  the  LISP  system 
from  the  point  of  view  of  the  user  and  the  system  engineer. 


* Barnett, Jeffrey A., and Robert E. Long. The SDC LISP 1.5 System for
IBM 360 Computers. SDC document SP-3043. January 1968.


15  April  1971 


System  Development  Corporation 
TM-3628/008/00 



5.3.1  Progress 

The improvements listed in items 1 through 6 above have been written, tested,
and checked out to the satisfaction of the principal users. Other
ARPA-sponsored projects that occasionally use LISP, such as the Voice
Input/Output project (see section 3.3) and the Networks project (see
section 4.1), also report satisfaction with the improvements.

With  the  advent  of  Release  8.8  of  ADEPT,  the  LISP  system  could  grow  from  46 
to  85  pages.  Although  this  is  only  a  46%  increase,  the  LISP  system  code  is 
now  well  contained  in  36  pages,  which  means  that  the  amount  of  data  space 
available  to  users  has  quadrupled,  from  11.5  to  47  pages.  That  is  a  dramatic 
jump, and has significantly improved LISP performance by lowering the
frequency of garbage collection overhead. No further real-memory increases are
now  possible  with  SDC  LISP,  since  we  are  at  the  limits  of  physical  resources. 
In  attempting  future  expansions  of  memory,  we  will  have  to  explore  virtual 
storage  and  paging  concepts. 
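
The link between data space and garbage collection frequency can be illustrated with a rough calculation. The sketch below assumes a simple collector that runs only when free space is exhausted and recovers every page not holding live data; this is a simplification, not a description of the SDC LISP collector. The 11.5-page and 47-page figures come from the text above, while the 8 pages of live data and the 1000 pages of total allocation are hypothetical values chosen for the example.

```python
import math


def collections_needed(total_alloc_pages, data_space_pages, live_pages):
    """Estimate how many collections occur while allocating total_alloc_pages.

    Assumes each collection frees everything except live_pages, so each
    cycle allows (data_space_pages - live_pages) pages of new allocation.
    """
    free_per_cycle = data_space_pages - live_pages
    if free_per_cycle <= 0:
        raise ValueError("live data does not fit in the data space")
    return math.ceil(total_alloc_pages / free_per_cycle)


# Figures from the report: user data space before and after ADEPT 8.8.
OLD_DATA_SPACE = 11.5
NEW_DATA_SPACE = 47.0

# Hypothetical workload, chosen only for illustration.
LIVE_PAGES = 8.0
TOTAL_ALLOC = 1000.0

growth = NEW_DATA_SPACE / OLD_DATA_SPACE
old_count = collections_needed(TOTAL_ALLOC, OLD_DATA_SPACE, LIVE_PAGES)
new_count = collections_needed(TOTAL_ALLOC, NEW_DATA_SPACE, LIVE_PAGES)

print(f"data space growth: {growth:.2f}x")
print(f"collections before: {old_count}, after: {new_count}")
```

Under these assumptions the data space grows by a factor of about 4.09, but the number of collections falls by a larger factor, because the space left over after live data is what determines how often the collector must run.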

No serious technical problems have been encountered in making the
improvements listed above. However, users have reported one or two occasions
of running out of program reference space. This seems to be caused by the
increase in binary program space (for compiling additional code) given the
user by both the GROW feature and the feature that swaps binary code to a
disc. Each new function takes an additional word in program reference space.
This problem will be self-correcting if and when the Core Image Generator is
written and the entire system thus regenerated.
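
The exhaustion described above can be pictured as a fixed-size table that gains one word per function. The following is a minimal sketch under stated assumptions: the class name, the capacity value, and the rule that redefining a function reuses its existing word are all hypothetical, and the real layout of program reference space in SDC LISP is not given in this report.

```python
class ProgramReferenceSpace:
    """Toy model of a fixed-size table holding one word per defined function."""

    def __init__(self, capacity_words):
        self.capacity_words = capacity_words
        self.slots = {}  # function name -> index of its word in the table

    def define(self, name):
        """Reserve a word for `name`, failing once the table is full."""
        if name in self.slots:
            # Assumption: redefinition reuses the word already reserved.
            return self.slots[name]
        if len(self.slots) >= self.capacity_words:
            raise MemoryError("program reference space exhausted")
        self.slots[name] = len(self.slots)
        return self.slots[name]

    def words_free(self):
        return self.capacity_words - len(self.slots)


# More binary program space lets users compile more functions, but the
# reference table does not grow with it, so the table fills first.
space = ProgramReferenceSpace(capacity_words=3)
for fn in ("PARSE", "SCAN", "EMIT"):
    space.define(fn)

try:
    space.define("OPTIMIZE")
    overflowed = False
except MemoryError:
    overflowed = True
```

In this model, regenerating the system with a larger table, as the Core Image Generator is expected to allow, corresponds to constructing the table with a larger `capacity_words`.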

The  LISP  Extensions  activity  is  a  cooperative  effort  of  ARPA  projects.  Since 
work  is  done  by  members  of  SDC’s  LISP  user  community,  it  can  continue  only  at 
a rate determined by the amount of time the users can spare from their
primary  tasks. 

5.3.2  Plans 

Three tasks (7, 8, and 9 above) remain to be done:
improvement of the primitive functions CAR, CDR, ATOM, etc.; Core Image
Generation; and documentation. Because of the nature of the staffing of
this activity, completion of these tasks is subject to project needs.

The nature of this task is such that it does not require full-time activity,
but  the  part-time  services  of  a  number  of  people.  Only  the  principals 
are  noted  here. 


